XML Sitemap Validator

Validate your XML sitemap against the sitemaps.org protocol. Check for errors, analyze URL distribution, and get a health score.

TL;DR: A broken sitemap means Google can't find your pages efficiently. This free sitemap validator checks your XML sitemap for structural errors, missing elements, and protocol violations. Get a health score, spot-check URL availability, and see exactly what needs fixing before it costs you rankings.

What Is an XML Sitemap?

An XML sitemap is a file that lists the URLs on your website that you want search engines to crawl and index. Think of it as a roadmap you hand directly to Google, Bing, and other search engines. Instead of relying on crawlers to find every page by following links, you give them a complete list.

The format follows the sitemaps.org protocol, an open standard adopted by all major search engines. A basic sitemap is a simple XML file with a list of URLs wrapped in specific tags. It can also include optional metadata like when each page was last modified and how frequently it changes.
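As a rough illustration of that structure, the sketch below builds a minimal sitemap with Python's standard library. The helper name build_sitemap and the example.com URLs are placeholders invented for this example; they are not part of this validator.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Hypothetical helper: build sitemap XML from (loc, lastmod) pairs.

    lastmod may be None, because <loc> is the only required tag.
    Returns UTF-8 encoded bytes including the XML declaration.
    """
    # Serialize the sitemap namespace as the default (unprefixed) one.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod:
            ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    sample = build_sitemap([
        ("https://example.com/", "2024-01-15"),
        ("https://example.com/about/", None),
    ])
    print(sample.decode("utf-8"))
```

Each page becomes one url element containing a loc, with the optional metadata added only when it is known.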

Most CMS platforms generate sitemaps automatically. WordPress, Shopify, Wix, and similar platforms all create them. But automatic does not mean correct. Plugins break, settings change, and edge cases create invalid entries. That is why validation matters.

Why Sitemap Validation Matters

Your XML sitemap is a direct communication channel with search engines. When it works, Google discovers and indexes your pages faster. When it is broken, Google wastes crawl budget on errors instead of indexing your content.

The tricky part is that sitemaps fail silently. Google will not email you when it encounters a broken sitemap. It simply stops using it. You might not notice for weeks or months, and by then, new pages are not getting indexed and old pages are not getting re-crawled.

Regular validation catches problems early. Run your sitemap through this validator after any major site change: CMS updates, plugin changes, URL restructuring, or large content additions.

Common Sitemap Errors

Most sitemap issues fall into a few predictable categories. Here are the ones this validator checks for.

  • Invalid XML syntax. Missing closing tags, incorrect nesting, or encoding issues will make the entire file unreadable to search engines.
  • Missing namespace declaration. The root element must include the correct sitemaps.org xmlns. Without it, parsers may reject the file.
  • Relative URLs. Every URL in a sitemap must be absolute, including the protocol (http:// or https://) and the host. Relative paths like /about/ are not valid.
  • URLs returning errors. Pages that return 404, 500, or redirect status codes should not be in your sitemap. They waste crawl budget.
  • Exceeding size limits. A single sitemap file cannot contain more than 50,000 URLs or exceed 50MB when uncompressed.
  • Incorrect lastmod format. Dates must follow the W3C Datetime format. The simplest form is YYYY-MM-DD; a full timestamp also needs a time zone, as in 2024-01-15T09:30:00+00:00. Free-form values like "last Tuesday" and other non-standard formats will be ignored.
  • Invalid changefreq or priority values. While Google largely ignores these fields, invalid values can cause parsing errors that affect the rest of the file.
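Several of these checks are straightforward to approximate in code. The following is a minimal sketch, not this validator's actual implementation: check_structure is a hypothetical function that covers XML syntax, the namespace, absolute URLs, and the size limits for a regular urlset file (it does not handle sitemap index files).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB, measured on the uncompressed file


def check_structure(xml_bytes):
    """Hypothetical check: return a list of problems found in a sitemap."""
    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append("file exceeds 50MB uncompressed")
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # Unparseable XML makes every other check meaningless.
        return problems + [f"invalid XML: {exc}"]
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append("root is not <urlset> in the sitemaps.org namespace")
        return problems
    urls = root.findall(f"{{{SITEMAP_NS}}}url")
    if len(urls) > MAX_URLS:
        problems.append(f"{len(urls)} URLs exceeds the 50,000 limit")
    for url in urls:
        loc = (url.findtext(f"{{{SITEMAP_NS}}}loc") or "").strip()
        parsed = urlparse(loc)
        # An absolute URL needs both a protocol and a host.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"not an absolute URL: {loc!r}")
    return problems


if __name__ == "__main__":
    broken = (
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>/about/</loc></url></urlset>"
    )
    print(check_structure(broken))
```

An empty list means none of the sketched checks found a problem; it says nothing about whether the listed pages actually resolve.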

What This Validator Checks

Check               Description
XML Syntax          Valid XML structure, proper encoding, correct nesting
Namespace           Correct sitemaps.org xmlns declaration
URL Validation      Every <loc> element has a valid, absolute URL
Size Limits         Under 50,000 URLs and 50MB file size
Lastmod Format      Dates follow W3C datetime format (YYYY-MM-DD)
Changefreq Values   Valid values: always, hourly, daily, weekly, monthly, yearly, never
Priority Range      Values between 0.0 and 1.0
URL Spot Check      Random sample of URLs checked for HTTP 200 response
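The URL spot check can be pictured as random sampling plus a status lookup. The sketch below is an assumption about how such a check might work, not a description of this tool's internals: spot_check, sample_size, and the injected fetch_status callable are all hypothetical names. Passing the status lookup in as a function keeps the example runnable without network access.

```python
import random


def spot_check(urls, fetch_status, sample_size=10, seed=None):
    """Hypothetical spot check: sample URLs and report non-200 responses.

    fetch_status is any callable that takes a URL and returns an HTTP
    status code; in real use it would wrap an HTTP client.
    Returns (sampled_urls, failures), where failures maps URL -> status.
    """
    rng = random.Random(seed)
    # Never ask for more items than the list contains.
    sample = rng.sample(urls, min(sample_size, len(urls)))
    failures = {}
    for url in sample:
        status = fetch_status(url)
        if status != 200:
            failures[url] = status
    return sample, failures


if __name__ == "__main__":
    # Fake status table standing in for real HTTP requests.
    fake_statuses = {
        "https://example.com/": 200,
        "https://example.com/old-page": 404,
        "https://example.com/moved": 301,
    }
    sampled, failed = spot_check(list(fake_statuses), fake_statuses.get)
    print(f"checked {len(sampled)} URLs, {len(failed)} did not return 200")
```

Sampling keeps the check fast on large sitemaps, at the cost of possibly missing a broken URL that falls outside the sample.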

Understanding Sitemap Tags

The sitemaps.org protocol defines four tags for each URL entry. Only one is required. The rest are optional but can provide useful signals.

loc (required)

The full URL of the page. It must be absolute, properly encoded, and use the same protocol (http or https) and host as the sitemap itself. This is the only required tag.

lastmod (recommended)

The date the page was last modified. Google uses this value to decide when to re-crawl pages, provided it is consistently and verifiably accurate. Setting every page to today's date defeats the purpose and teaches Google to ignore your lastmod values entirely.
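A lastmod value can be checked with a pattern match followed by a calendar check. This is a minimal sketch under stated assumptions: is_valid_lastmod is a hypothetical helper, and it accepts only the two most common W3C Datetime forms (a complete date, or a date and time with a time zone designator), not the year-only or year-month forms the format also allows.

```python
import re
from datetime import date

# Complete date, optionally followed by a time and a required time zone
# designator (Z or +hh:mm / -hh:mm), per the W3C Datetime format.
_W3C_DATETIME = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"
    r"(?:T(?:[01]\d|2[0-3]):[0-5]\d(?::[0-5]\d(?:\.\d+)?)?"
    r"(?:Z|[+-]\d{2}:\d{2}))?"
)


def is_valid_lastmod(value):
    """Hypothetical check for the common W3C Datetime forms."""
    match = _W3C_DATETIME.fullmatch(value.strip())
    if not match:
        return False
    year, month, day = (int(part) for part in match.group(1, 2, 3))
    try:
        # Rejects impossible calendar dates such as February 30.
        date(year, month, day)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("2024-01-15", "2024-01-15T09:30:00+00:00",
                      "2024-02-30", "last Tuesday"):
        print(candidate, is_valid_lastmod(candidate))
```

A format check like this confirms only that the value is well-formed; whether the date reflects a real content change is something only the site owner can verify.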

changefreq (optional)

A hint about how often the page changes. Valid values are: always, hourly, daily, weekly, monthly, yearly, and never. Google has publicly stated that it ignores this tag. It determines crawl frequency based on its own observations. You can include it, but do not rely on it.

priority (optional)

A value between 0.0 and 1.0 indicating the relative importance of a page compared to other pages on your site. Google also ignores this tag. It was a good idea in theory, but site owners tend to set everything to 1.0, which makes the signal meaningless.
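Both optional tags have small, fixed value spaces, so validating them is simple. The helpers below are hypothetical names used for illustration. Note one assumption: the priority check relies on Python's float parsing, which is slightly more lenient than the sitemap schema (it would accept exponent notation, for instance).

```python
# The seven values the sitemaps.org protocol allows for <changefreq>.
VALID_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def is_valid_changefreq(value):
    """Hypothetical check: value must be one of the seven allowed words."""
    # Matching is case-sensitive: the protocol lists lowercase values.
    return value.strip() in VALID_CHANGEFREQ


def is_valid_priority(value):
    """Hypothetical check: value must parse as a number in 0.0..1.0."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False
    # NaN fails both comparisons, so it is rejected here as well.
    return 0.0 <= number <= 1.0


if __name__ == "__main__":
    print(is_valid_changefreq("weekly"), is_valid_changefreq("fortnightly"))
    print(is_valid_priority("0.8"), is_valid_priority("1.5"))
```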

Sitemap Index Files

If your site has more than 50,000 URLs, you cannot fit them all in a single sitemap. The solution is a sitemap index file. This is an XML file that references multiple child sitemaps. Each child sitemap follows the same 50,000-URL and 50MB limits.
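The splitting logic can be sketched as follows. The function names chunk_urls and build_sitemap_index, and the example.com file names, are assumptions made for this illustration. The sketch enforces only the URL-count limit; a real generator would also need to keep each child file under 50MB.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000


def chunk_urls(urls, size=MAX_URLS):
    """Hypothetical helper: split a URL list into child-sitemap batches."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(child_sitemap_urls):
    """Hypothetical helper: build an index that references child sitemaps.

    Returns UTF-8 encoded bytes including the XML declaration.
    """
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for child in child_sitemap_urls:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = child
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    all_urls = [f"https://example.com/page-{n}" for n in range(120_001)]
    batches = chunk_urls(all_urls)
    children = [
        f"https://example.com/sitemap-{n}.xml"
        for n in range(1, len(batches) + 1)
    ]
    print(len(batches), "child sitemaps needed")
    print(build_sitemap_index(children).decode("utf-8"))
```

The index uses a sitemapindex root with one sitemap entry per child file, in contrast to the urlset and url elements of a regular sitemap.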

Even sites with fewer than 50,000 URLs often use sitemap indexes to organize URLs by type. For example, you might have separate sitemaps for blog posts, product pages, and category pages. This makes it easier to monitor indexing rates for each content type in Google Search Console.

How Google Uses Your Sitemap

Google treats your sitemap as a suggestion, not a directive. Including a URL in your sitemap does not guarantee it will be indexed. And excluding a URL does not prevent indexing. Google decides what to index based on quality, relevance, and demand.

What sitemaps do help with is discovery. New pages get found faster. Updated pages get re-crawled sooner when the lastmod date changes. And large sites with complex architectures benefit from having a clear URL inventory that Google can reference.

Google also uses sitemaps to learn about alternate language versions of pages (hreflang), video content, news articles, and image locations through sitemap extensions.

How to Submit Your Sitemap to Google Search Console

  1. Log in to Google Search Console and select your property.
  2. Click "Sitemaps" in the left sidebar under the Indexing section.
  3. Enter your sitemap URL (typically /sitemap.xml) in the "Add a new sitemap" field.
  4. Click "Submit." Google will fetch and process the sitemap. Check back after a few days to see how many URLs were discovered and how many were indexed.

You can also reference your sitemap in your robots.txt file with a Sitemap directive. This allows any search engine crawler to discover it automatically without manual submission.
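To see which sitemaps a robots.txt file declares, you can scan it for Sitemap lines. The sketch below uses a hypothetical find_sitemaps function and a made-up robots.txt body; it treats the directive name as case-insensitive, which is an assumption that matches how crawlers commonly read the file.

```python
def find_sitemaps(robots_txt):
    """Hypothetical helper: return sitemap URLs declared in robots.txt text."""
    sitemaps = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the URL's own "https:" survives.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


if __name__ == "__main__":
    example = "\n".join([
        "User-agent: *",
        "Disallow: /admin/",
        "Sitemap: https://example.com/sitemap.xml",
        "# Sitemap: https://example.com/retired.xml",
    ])
    print(find_sitemaps(example))
```

Commented-out lines are skipped naturally, because the text before their first colon is not the bare word "sitemap".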

Sitemap Best Practices

  • Only include indexable pages. Every URL should return a 200 status code and be the canonical version. Do not include pages with noindex tags, redirects, or error codes.
  • Keep accurate lastmod dates. Only update the date when the page content actually changes. Artificially inflating dates erodes trust with Google.
  • Use sitemap index files for large sites. Stay under the 50,000-URL and 50MB limits per file. Organize by content type for easier monitoring.
  • Submit to Search Console. Do not rely on robots.txt alone. Explicit submission prompts Google to fetch the sitemap and gives you indexing data and error reports for it.
  • Automate updates. Dynamic sitemaps that regenerate when content changes are more reliable than static files that go stale.
  • Validate after every major change. CMS updates, plugin changes, and URL restructuring can silently break your sitemap.

Frequently Asked Questions

Where should my sitemap be located?

The standard location is yoursite.com/sitemap.xml. You should also reference it in your robots.txt file with a Sitemap directive. Google can find sitemaps at any URL, but the root location is the convention that tools and crawlers check first.

Does every website need a sitemap?

Small sites (under 500 pages) with good internal linking can get by without one. But there is no downside to having a sitemap, and it helps Google find pages that might not be well-linked internally. For larger sites and new sites without many backlinks, sitemaps are essential.

Should I include images and videos in my sitemap?

Google supports image and video sitemap extensions. If your site relies heavily on visual content like product photos or video tutorials, including these can improve visibility in image and video search results. For most text-focused sites, a standard URL sitemap is sufficient.

How often should I regenerate my sitemap?

Ideally, your sitemap updates automatically whenever you publish or modify content. If you generate it manually, update it whenever you add or remove pages. At minimum, regenerate monthly. Stale sitemaps with dead URLs waste crawl budget and send bad signals.

What is the difference between a sitemap and a sitemap index?

A sitemap lists individual page URLs. A sitemap index lists other sitemap files. Large sites use a sitemap index to organize URLs into multiple smaller sitemaps by category, date, or content type. Each child sitemap must follow the same 50,000-URL and 50MB limits.

Does Google actually use changefreq and priority?

Google has stated publicly that it largely ignores changefreq and priority values. It determines crawl frequency based on its own observations of how often a page changes. However, lastmod is used and valued, so make sure those dates are accurate.

Can a broken sitemap hurt my rankings?

A broken sitemap will not directly penalize your rankings. But it can slow down discovery of new content, delay re-crawling of updated pages, and waste crawl budget on error URLs. Over time, these indirect effects can impact your organic performance, especially on large sites.
