An XML sitemap is a document that lists the pages on your website. It can also include additional information about each page, such as the date it was last modified (lastmod), its priority relative to your other pages (priority) and how often it changes (changefreq).
Take a news website with 50,000 pages as an example. If one article is updated, it might take up to a month for search engines to (i) discover that the page has changed and (ii) update their copy of it. How long this takes depends on the crawl budget the search engine allocates to the site. With an XML sitemap, we can signal to search engines that the article has been updated via its lastmod date, which can speed up the process of search engines refreshing their index with the latest copy of your webpage.
XML sitemaps are typically placed in the root of your website, e.g. https://bravr.com/sitemap.xml
XML sitemaps are designed primarily for search engines and are written in XML. Here is an example of an XML sitemap extracted from Bravr.com:
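<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://bravr.com/</loc>
    <lastmod>2022-06-14T19:55:25+02:00</lastmod>
  </url>
  <url>
    <loc>https://bravr.com/blog/</loc>
    <lastmod>2022-06-24T10:23:20+02:00</lastmod>
  </url>
</urlset>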
Let’s break down the individual parts:
<?xml version="1.0" encoding="UTF-8"?>
This header declares that the XML sitemap is structured according to version 1.0 of the XML standard and uses UTF-8 character encoding. It tells search engines what to expect and how to parse the content.
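<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
The <urlset> tag groups the list of URLs contained within the XML sitemap. The 0.9 in its xmlns attribute indicates which version of the XML Sitemap standard is used. This tag is closed with </urlset> at the bottom of the XML sitemap. Within the <urlset> there are multiple values: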
For every web page there is a separate <url> tag. Within the <url> tag is a <loc> tag, which is short for location. The value of the <loc> tag should be the full URL, including the protocol (e.g. https://).
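The structure described above can be generated programmatically. The following is a minimal sketch in Python using only the standard library; the function name build_sitemap and the (loc, lastmod) input shape are assumptions made for illustration, not part of any particular CMS or plugin.

```python
import xml.etree.ElementTree as ET

# Namespace URI from the Sitemap 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes for an iterable of (loc, lastmod) pairs.

    build_sitemap is a hypothetical helper. lastmod may be None, in which
    case the optional <lastmod> tag is left out for that URL.
    """
    # Setting a plain "xmlns" attribute makes ElementTree emit the default
    # namespace declaration on the root <urlset> element.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    # xml_declaration=True adds the <?xml ...?> header (Python 3.8+).
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        ("https://bravr.com/", "2022-06-14T19:55:25+02:00"),
        ("https://bravr.com/blog/", "2022-06-24T10:23:20+02:00"),
    ])
    print(xml_bytes.decode("utf-8"))
```

In practice most content management systems or SEO plugins generate this file for you; a script like this is mainly useful for static or custom-built sites.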
There are additional tags which can be included within the <url> group for each URL. These are:
- lastmod: the date the content at that URL was last modified. The date is in “W3C datetime” format, e.g. 2022-03-21T15:01+00:00, or simply 2022-03-21.
- priority: the priority of the URL relative to other pages on your own website, on a scale from 0.0 to 1.0. Normally the homepage should be set to 1.0, followed by your top-level pages at 0.9 and child pages at 0.5. Setting everything to 1.0 means all pages are treated as equal.
- changefreq: how often the web page is expected to change. The acceptable values are: always, hourly, daily, weekly, monthly, yearly and never.
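To make the formats of these optional tags concrete, here is a small Python sketch that formats a lastmod value and checks priority and changefreq. The helper names format_lastmod and validate_optional_tags are hypothetical; the set of changefreq values is the one listed in the sitemaps.org protocol.

```python
from datetime import datetime, timezone

# changefreq values as listed in the sitemaps.org protocol.
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def format_lastmod(dt):
    """Format a timezone-aware datetime as a W3C datetime string."""
    if dt.tzinfo is None:
        raise ValueError("lastmod needs a timezone-aware datetime")
    # isoformat() yields e.g. 2022-03-21T15:01:00+00:00
    return dt.isoformat(timespec="seconds")


def validate_optional_tags(priority=None, changefreq=None):
    """Return a list of problems found; an empty list means both are valid."""
    problems = []
    if priority is not None and not 0.0 <= priority <= 1.0:
        problems.append(f"priority {priority} is outside 0.0 to 1.0")
    if changefreq is not None and changefreq not in CHANGEFREQ_VALUES:
        problems.append(f"changefreq {changefreq!r} is not a recognised value")
    return problems


if __name__ == "__main__":
    print(format_lastmod(datetime(2022, 3, 21, 15, 1, tzinfo=timezone.utc)))
    print(validate_optional_tags(priority=1.5, changefreq="fortnightly"))
```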
We wouldn’t worry too much about lastmod, priority and changefreq. These tags have been abused by webmasters, so search engines may ignore them.
As part of any SEO strategy, it’s important that your XML sitemap is configured correctly. We often see the following errors:
- URLs discovered in crawls which are not included in XML sitemaps.
- URLs included in XML sitemaps which should not be included.
- Redirects submitted in XML sitemaps.
- 404s submitted in XML sitemaps.
- Noindex pages submitted in XML sitemaps.
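Several of the errors listed above can be checked automatically once you have fetched each URL in the sitemap. The sketch below shows one possible way to classify the results in Python. It assumes you already have the HTTP status code, response headers and HTML for each URL (fetched without following redirects); the function names are hypothetical, and the noindex check is a deliberate simplification that only looks at the X-Robots-Tag header and a robots meta tag whose name attribute comes before content.

```python
import re

# Simplified pattern: matches <meta name="robots" content="...noindex...">
# only when the name attribute appears before the content attribute.
NOINDEX_META = re.compile(
    r"""<meta[^>]+name=["']robots["'][^>]*content=["'][^"']*noindex""",
    re.IGNORECASE,
)


def classify_response(status, headers=None, html=""):
    """Return the sitemap issue for one fetched URL, or None if none is found."""
    headers = {k.lower(): v for k, v in (headers or {}).items()}
    if 300 <= status < 400:
        return "redirect"
    if status == 404:
        return "not found"
    robots_header = headers.get("x-robots-tag", "").lower()
    if "noindex" in robots_header or NOINDEX_META.search(html):
        return "noindex"
    return None


def missing_from_sitemap(crawled_urls, sitemap_urls):
    """Return URLs found in a crawl that the sitemap does not list."""
    return sorted(set(crawled_urls) - set(sitemap_urls))


if __name__ == "__main__":
    print(classify_response(301))
    print(classify_response(200, {"X-Robots-Tag": "noindex"}))
    print(missing_from_sitemap(
        ["https://example.com/a", "https://example.com/b"],
        ["https://example.com/a"],
    ))
```

Deciding which URLs "should not be included" still needs human judgement, so a script like this only covers the mechanical part of an audit.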
If you need an XML sitemap audit or help setting up an XML sitemap, please contact us.