What is an XML sitemap and why should you have one?
A good XML sitemap acts as a roadmap of your website that leads Google to all your important pages. XML sitemaps can be good for SEO, allowing Google to find your essential pages quickly, even if your internal linking isn’t perfect. This post explains what they are and how they help you rank better.
What are XML sitemaps?
An XML sitemap is a file that lists a website’s essential pages, making sure Google can find and crawl them all. It also helps search engines understand your website structure. You want Google to crawl every important page of your website. But sometimes, pages end up without internal links, making them hard to find. A sitemap can help speed up content discovery.
Looking to expand your knowledge of technical SEO? We have a course in the Yoast SEO Academy focusing on crawlability and indexability. One of the topics we tackle is how to use XML sitemaps properly.
What does an XML sitemap look like?
An XML sitemap offers a standardized way of listing posts and pages, making them discoverable for search engines. Here’s a very simple example, a sitemap with a single URL:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.yoast.com/wordpress-seo/</loc>
    <lastmod>2024-01-01</lastmod>
  </url>
</urlset>
It consists of a couple of parts:
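- The XML version declaration: search engine crawlers use it to determine the type of file they are reading.
- The URL set: the <urlset> element wraps all URLs and, through its xmlns attribute, tells search engines which sitemap protocol the file follows.
- The URL: each <url> element holds the address of one page in its <loc> element.
- Lastmod: the date on which the page was last modified.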
Every sitemap needs to follow this standard to be valid. Other properties like <priority> and <changefreq> don’t affect the workings or performance of the sitemap.
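As a rough illustration of the parts listed above, here is a minimal Python sketch that writes the same single-URL sitemap using only the standard library. The function name build_sitemap is our own for this example and is not part of Yoast SEO or any sitemap tool.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the example sitemap above.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string for an iterable of (url, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    # Prepend the XML declaration by hand so the output is a plain str.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([("https://www.yoast.com/wordpress-seo/", "2024-01-01")])
print(sitemap_xml)
```

The output is not pretty-printed, but it contains the same declaration, URL set, URL, and lastmod parts as the hand-written example.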
The importance of lastmod
Google and Bing have recognized the usefulness of the lastmod feature. Fabrice Canel from Microsoft Bing says including the <lastmod> tag in your sitemap is crucial. Gary Illyes from Google says:
“The <lastmod> element in sitemaps is a signal that can help crawlers figure out how often to crawl your pages.”
In its XML sitemap documentation, Google says:
“Google uses the <lastmod> value if it’s consistently and verifiably (for example by comparing to the last modification of the page) accurate.”
Google also explains what the lastmod date should reflect:
“The value should reflect the date and time of the last significant update to the page. For example, an update to the main content, the structured data, or links on the page is generally considered significant, however an update to the copyright date is not.”
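One way to keep lastmod accurate is to change it only when the main content actually changes. The sketch below is a hypothetical approach, not how Yoast SEO or Google works: it fingerprints the main content and only moves the date when that fingerprint differs. The function names and the idea of storing a fingerprint next to each page are assumptions made for this example.

```python
import hashlib
from datetime import datetime, timezone


def content_fingerprint(main_content):
    """Hash the main content so two versions can be compared cheaply."""
    return hashlib.sha256(main_content.encode("utf-8")).hexdigest()


def next_lastmod(old_fingerprint, old_lastmod, main_content, now):
    """Return (fingerprint, lastmod); lastmod only moves when the content changed.

    `now` must be a timezone-aware datetime.
    """
    new_fingerprint = content_fingerprint(main_content)
    if new_fingerprint == old_fingerprint:
        # Nothing significant changed, so keep the old date.
        return old_fingerprint, old_lastmod
    stamp = now.astimezone(timezone.utc).isoformat(timespec="seconds")
    return new_fingerprint, stamp


published = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)
fingerprint, lastmod = next_lastmod(None, None, "First version of the post", published)
print(lastmod)  # 2024-01-02T09:00:00+00:00
```

A real site would decide for itself what counts as main content; a footer with a copyright year, for example, would be left out of the fingerprint.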
An example of an XML sitemap
Let’s take a look at an example. Below, you’ll see a screenshot of the base XML sitemap of yoast.com. You can see all the different sitemaps on yoast.com. You’ll notice a date at the end of each line. This tells Google when each post was last updated and helps with SEO because you want Google to crawl your updated content as soon as possible. When a date changes in the sitemap, Google knows new content exists to crawl and index.
As you can see, the Yoast.com XML sitemap index lists several sitemaps: post-sitemap.xml, page-sitemap.xml, video-sitemap.xml, etc. This categorization makes a site’s structure as straightforward as possible. If you click on one of the listed sitemaps, you’ll see all the URLs in that particular sitemap. For example, if you click on post-sitemap.xml, you’ll see all Yoast.com’s post URLs.
Sometimes, splitting a sitemap is necessary if you have a huge website. A single XML sitemap is limited to 50,000 URLs and can have a file size of up to 50 MB. If your website has over 50,000 posts, you’ll need two separate sitemaps for the post URLs, effectively adding a second post sitemap to the index. The Yoast SEO plugin sets the limit even lower, at 1,000 URLs, to keep your sitemap loading as fast as possible.
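To make the splitting concrete, here is a small Python sketch that cuts a long list of URLs into chunks of at most 50,000 and builds an index that points to one sitemap per chunk. The example.com URLs and the numbered file names are made up for this example, and the sketch does not check the 50 MB file size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # limit mentioned above


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (as a string) that lists the given sitemap URLs."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical site with 120,000 posts.
post_urls = [f"https://example.com/post-{n}/" for n in range(120_000)]
chunks = chunk_urls(post_urls)
sitemap_names = [
    f"https://example.com/post-sitemap{i}.xml" for i in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(sitemap_names)
print(len(chunks))  # 3
```

Passing size=1000 to chunk_urls would mimic the lower limit that Yoast SEO uses.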
What websites need an XML sitemap?
Google’s documentation says sitemaps are beneficial for “really large websites,” “websites with large archives,” “new websites with just a few external links to them,” and “websites which use rich media content.” According to Google, proper internal linking should allow it to find all your content easily. Unfortunately, many sites do not link their content logically.
While we agree that these websites will benefit the most from having one, at Yoast, we think XML sitemaps benefit every website. As the web grows, it’s getting harder and harder to index sites properly. That’s why you should give search engines every available option to find your content. In addition, XML sitemaps make the crawling process more efficient and greener for search engines.
Every website needs Google to find essential pages easily and know when they were last updated. That’s why this feature is included in the Yoast SEO plugin.
Which pages should be in your XML sitemap?
How do you decide which pages to include in your XML sitemap? Always start by thinking of the relevance of a URL: when a visitor lands on a particular URL, is it a good result? Do you want visitors to land on that URL? If not, it probably shouldn’t be in your sitemap. However, if you don’t want that URL to appear in the search results, you must also add a ‘noindex’ tag. Leaving it out of your sitemap doesn’t mean Google won’t index the URL. If Google can find it by following links, Google can index the URL.
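The two steps above (mark a page noindex, and leave it out of the sitemap) can be kept in sync with a simple check. The Python sketch below assumes the ‘noindex’ tag is a robots meta tag in the page’s HTML; the class and function names are our own, and the pages dictionary is invented sample data.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Remember whether a page has a robots meta tag containing 'noindex'."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = (attributes.get("content") or "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True


def has_noindex(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def sitemap_candidates(pages):
    """Given a dict of URL -> HTML, return the URLs that are not marked noindex."""
    return [url for url, html in pages.items() if not has_noindex(html)]


pages = {
    "https://example.com/blog/first-post/": '<meta name="robots" content="index, follow">',
    "https://example.com/thank-you/": '<meta name="robots" content="noindex, follow">',
}
print(sitemap_candidates(pages))
```

With this sample data, only the blog post is kept as a sitemap candidate; the thank you page is skipped because it is marked noindex.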
Example: A new blog
For example, you are starting a new blog. Of course, you want to ensure your target audience can find your blog posts in the search results. So, it’s a good idea to immediately include your posts in your XML sitemap. It’s safe to assume that most of your pages will also be relevant results for your visitors. However, a thank you page that people will see after they’ve subscribed to your newsletter is not something you want to appear in the search results. In this case, you don’t want to exclude all pages from your sitemap, only this one.
Let’s stay with the example of the new blog. In addition to your blog posts, you create some categories and tags. These categories and tags will have archive pages that list all posts in that specific category or tag. However, initially, there might not be enough content to fill these archive pages, making them ‘thin content’. For example, tag archives that show just one post are not that valuable to visitors yet. You can exclude them from the sitemap when starting your blog and include them once you have enough posts. You can even exclude all your tag pages or category pages simultaneously using Yoast SEO.
However, this kind of page could also be excellent ranking material. So, if you think: well, yes, this tag page is a bit ‘thin’ right now, but it could be a great landing page, then enrich it with additional information and images. And don’t exclude it from your sitemap in this case.
How to make Google find your sitemap
If you want Google to find your XML sitemap quicker, you’ll need to add it to your Google Search Console account. You can find the sitemaps you have already submitted in the ‘Sitemaps’ section. If yours isn’t listed there, you can add it at the top of the page.
Adding your sitemap helps you check whether Google has indexed all the pages in it. We recommend looking into this further if there is a big difference between the ‘submitted’ and ‘indexed’ numbers on a particular sitemap. Maybe there’s an error that prevents some pages from being indexed? Another option is to add more links pointing to content that has not yet been indexed.
How to add XML sitemaps to your site with Yoast SEO
Because of their SEO value, we’ve added the ability to create your XML sitemaps in our Yoast SEO plugin. They are available in both the free and Premium versions of the plugin. We make them automatically for you, placing them in the right place. You don’t have to worry about where your XML sitemap should be placed and about optimizing it for search engines.
While we worked with Google to bring XML sitemaps natively to WordPress, we offer a superior version of sitemaps in Yoast SEO. The WordPress one is basic and not nearly as fine-tuned or fully featured as the one in Yoast SEO. If you install Yoast SEO, we automatically disable the WordPress sitemap for you.
Yoast SEO creates an XML sitemap for your website automatically. Click on ‘SEO’ in the sidebar of your WordPress install and then select the ‘Site features’ tab below General. Scroll down to the APIs section:
In this screen, you can enable or disable the different XML sitemaps for your website by using the slide button below the feature. Also, you can click ‘Learn more’ to read up on exactly what an XML sitemap is and why you need one. Click on ‘View the XML sitemap’ to view your website’s XML sitemap.
How to exclude content types from your XML sitemap
You can exclude content types from your XML sitemap in the Yoast SEO settings ‘Content types’ section. Click on the content type you want to exclude (for example, posts) and use the slider button next to ‘show posts in search results’ to disable it. If you do, this content won’t be included in your XML sitemap.
This doesn’t mean we recommend excluding your posts and pages from your XML sitemap. But you are in control of what content types show up in your XML sitemap. You can also do this for individual posts and pages by going to the Advanced settings in the Yoast SEO meta box or sidebar and selecting an option under ‘Allow search engines to show this Post/Page in search results?’. Want to know more about when and why you should exclude certain content from your XML sitemap? Read our post on indexing in Yoast SEO: what pages to show in Google’s search results.
Frequently asked questions about XML sitemaps
There are a lot of questions regarding XML sitemaps, so we’ve answered a couple in the FAQ below.
If the XML sitemap isn’t valid or search engines can’t read it correctly, you must find out what type of error is listed. If search engines aren’t reading the XML sitemap, ensure it’s submitted to the search engine’s webmaster tools. If it’s not valid, check the errors and find the specific solution to your issue.
In most cases, you can find out if sites have an XML sitemap by adding sitemap.xml to a root domain. So, that would be example.com/sitemap.xml. If a site has Yoast SEO installed, you’ll notice that it’s redirected to example.com/sitemap_index.xml. Sitemap_index.xml is the base sitemap that collects all the sitemaps on your site on one page.
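As a small illustration, the Python sketch below builds the two addresses mentioned above for any site root and reads the child sitemaps out of a sitemap index. It does not fetch anything over the network; the sample index is hand-written for this example and the function names are our own.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def candidate_sitemap_urls(site_root):
    """Return the usual places to look for a sitemap on a site."""
    root = site_root if site_root.endswith("/") else site_root + "/"
    return [urljoin(root, name) for name in ("sitemap.xml", "sitemap_index.xml")]


def list_child_sitemaps(index_xml):
    """Return the sitemap URLs listed in a sitemap index document."""
    root = ET.fromstring(index_xml.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS)]


# Hand-written sample index, for illustration only.
sample_index = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/post-sitemap.xml</loc></sitemap>
  <sitemap><loc>https://example.com/page-sitemap.xml</loc></sitemap>
</sitemapindex>"""

print(candidate_sitemap_urls("https://example.com"))
print(list_child_sitemaps(sample_index))
```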
There are ways to make and update your sitemaps by hand, but you shouldn’t do that. Also, there are static generators out there that help you make a sitemap at any given moment. But, again, you would need to repeat this process every time you add or update content. The best way to do this is by simply using Yoast SEO. Turn on the XML sitemap in Yoast SEO and sit back: all your updates will happen automatically.
In the past, people were sure that adding the <priority> tag to sitemaps would signal to Google that specific URLs should be prioritized. Unfortunately, it doesn’t do anything, as Google has often mentioned that it does not use this tag to read and prioritize content found in sitemaps.
Check your own XML sitemap!
Now you know how important it is to have an XML sitemap: having one can help your site’s SEO. If you add the correct URLs, Google can easily access your most important pages and posts. Google will also find updated content easily, so it knows when a URL needs to be crawled again. Lastly, adding your XML sitemap to Google Search Console helps Google find your sitemap fast and allows you to check for sitemap errors.
So check your XML sitemap and find out if you’re doing it right!