What is an XML sitemap?
An XML sitemap is a file that lists all the pages on your website that you want Google to find and index. Having a properly formatted and complete XML sitemap is important for SEO, especially if your internal linking isn’t very good. Your sitemap will lead Google straight to the important pages on your website, even if no internal links point to them.
Sometimes important pages on your website mistakenly don’t have any internal links pointing to them, so it’s a good rule of thumb to ensure all important pages are in your sitemap, and then improve your internal linking process.
There are three basic checks you can make to ensure your sitemap is in the best condition:
Are all important URLs listed?
If you have a static XML sitemap, it’s likely that it is now outdated, as it will be a snapshot of what the website was like at the time of creation unless it has since been amended. Dynamic sitemaps are much more efficient, as they save you from having to amend your sitemap every time you make a URL change. However, it is highly recommended that you regularly check the settings so that important URLs or website sections aren’t excluded from the sitemap.
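As a rough illustration of what a dynamic sitemap does behind the scenes, the sketch below rebuilds the sitemap from the current list of URLs each time it is called. The function name build_sitemap and the example.com URLs are hypothetical, and a real CMS or plugin will have its own way of doing this; the only fixed detail is the namespace defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML as a string, listing each URL once in sorted order.

    `urls` is assumed to be whatever your site currently considers
    indexable, for example the result of a database query.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(set(urls)):
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URLs for illustration only.
    print(build_sitemap([
        "https://example.com/",
        "https://example.com/services",
    ]))
```

Because the file is regenerated from live data, a new or removed page shows up in the sitemap without manual edits, which is why the settings that decide what goes into that list are worth checking regularly.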
To check this, you will need a crawling tool that can do a deep crawl of all the URLs on your website, so you can compare them to the URLs listed in the sitemap.
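Once you have a crawl export, the comparison itself is a simple set difference. The sketch below assumes you already have the crawled URLs as a list and the sitemap as an XML string; the function names and sample URLs are hypothetical, and it handles a single sitemap file rather than a sitemap index.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Extract the set of <loc> values from a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def compare_crawl_to_sitemap(crawled_urls, sitemap_xml):
    """Report URLs found by the crawl but not listed, and vice versa."""
    listed = sitemap_urls(sitemap_xml)
    crawled = set(crawled_urls)
    return {
        "missing_from_sitemap": sorted(crawled - listed),
        "not_found_in_crawl": sorted(listed - crawled),
    }


if __name__ == "__main__":
    # Hypothetical sitemap and crawl results for illustration only.
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/</loc></url>"
        "<url><loc>https://example.com/old-page</loc></url>"
        "</urlset>"
    )
    crawl = ["https://example.com/", "https://example.com/services"]
    print(compare_crawl_to_sitemap(crawl, sample))
```

URLs in the first list are candidates to add to the sitemap; URLs in the second are listed but were not reached by the crawl, which often means they have no internal links or no longer exist.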
Do you need to remove old/broken URLs?
Your sitemap should only contain URLs that you want to be indexed, that return a 200 response code and that are internally linked within your website. Based on this, the following should NOT feature in your sitemap:
- URLs that are blocked by your robots.txt file – read our guide on the importance of a robots.txt file
- Noindexed URLs
- Paginated URLs
- Canonicalised URLs
- Removed URLs
As with the previous check, crawling your website will help you identify any URLs you may not want included in your sitemap.
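The exclusion rules above can be applied to crawl data with a small filter. This is a minimal sketch, assuming your crawler exports one record per URL; the CrawledPage fields are hypothetical names rather than any particular tool's export format, and how a page gets flagged as paginated depends on your site's URL patterns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawledPage:
    """One row of a crawl export (field names are assumptions)."""
    url: str
    status: int
    noindex: bool = False
    blocked_by_robots: bool = False
    paginated: bool = False
    canonical: Optional[str] = None  # canonical URL declared by the page


def exclusion_reason(page):
    """Return why a page should be left out of the sitemap, or None."""
    if page.blocked_by_robots:
        return "blocked by robots.txt"
    if page.status != 200:
        return f"returns status {page.status}"
    if page.noindex:
        return "noindexed"
    if page.paginated:
        return "paginated"
    if page.canonical is not None and page.canonical != page.url:
        return "canonicalised to another URL"
    return None


def split_for_sitemap(pages):
    """Return (urls_to_keep, {url: reason}) for a list of CrawledPage."""
    keep, drop = [], {}
    for page in pages:
        reason = exclusion_reason(page)
        if reason is None:
            keep.append(page.url)
        else:
            drop[page.url] = reason
    return keep, drop
```

Keeping the reason alongside each dropped URL makes it easier to decide whether the page should stay out of the sitemap or whether the underlying issue, such as an accidental noindex tag, needs fixing instead.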
Have all my XML sitemap URLs been indexed by Google?
To see the full picture of which URLs have and haven’t been indexed by Google, you must submit your XML sitemap in Google Search Console. This will then show you which URLs have been successfully indexed and which have encountered errors. URLs that come back as errors are likely to be one of the following:
- URLs which return 404 errors
- Duplicate URLs
- Crawled URLs that aren’t currently indexed
- URLs that have been found but not yet indexed
Do you need help with your XML sitemap and general SEO maintenance? Our team have years of experience in boosting SEO performance for all our clients. Head over to our SEO services page to see how we can help, or read our case studies.