Robots.txt sitemap entries

The robots.txt file is a plain-text file that tells search engine crawlers which parts of your website they may crawl. It can also specify the location of your XML sitemap(s). Here’s a step-by-step guide on how to do it:

Create or access your robots.txt file

If you don’t already have a robots.txt file, create one in the root directory of your website so that it is served at a URL such as https://www.example.com/robots.txt. If it already exists, make sure you have access to edit it.

Specify the sitemap location

To specify the location of your XML sitemap, your robots.txt file should look something like this:

User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: https://www.example.com/sitemap.xml
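
As a quick local check, the example above can be read back with Python’s standard urllib.robotparser module. This is a minimal sketch, assuming Python 3.8 or later (needed for site_maps()); the bot name OtherBot and the page path are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt from above, held in a string so nothing is fetched.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Sitemap lines are independent of the User-agent groups around them.
sitemaps = parser.site_maps()  # list of URLs, or None if there are none

# Hypothetical URL, used only to show how the two groups apply.
page = "https://www.example.com/nogooglebot/page.html"
googlebot_blocked = not parser.can_fetch("Googlebot", page)
other_bot_allowed = parser.can_fetch("OtherBot", page)

print(sitemaps)
print(googlebot_blocked, other_bot_allowed)
```

If the Sitemap line is missing or misspelled, site_maps() returns None, which makes this a cheap sanity check before uploading the file.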

You may also want to add an Allow rule as a hint about your HTML sitemap. This gives spiders that follow web links, rather than reading XML sitemaps, a navigable page to start from.

User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /
Allow: /sitemap.htm

Sitemap: https://www.example.com/sitemap.xml

You should also link to your HTML sitemap from one of your active pages to ensure it is navigable and not orphaned. This can be a subtle, discreet link, but spiders will discover it more easily if it is linked from high up in your website, such as from your homepage.
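
One way to confirm that the HTML sitemap is not orphaned is to scan a page’s markup for a link to it. The sketch below uses only the Python standard library; the LinkCollector class, the links_to helper, and the homepage markup are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute target of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.add(urljoin(self.base_url, value))


def links_to(page_html, page_url, target_url):
    """Return True if page_html contains a link that resolves to target_url."""
    collector = LinkCollector(page_url)
    collector.feed(page_html)
    return target_url in collector.links


# Hypothetical homepage markup with a discreet footer link.
HOMEPAGE = """
<html><body>
  <h1>Welcome</h1>
  <footer><a href="/sitemap.htm">Site map</a></footer>
</body></html>
"""

has_link = links_to(
    HOMEPAGE, "https://www.example.com/", "https://www.example.com/sitemap.htm")
print(has_link)
```

In practice you would feed in the HTML of your real homepage; the check only looks at anchor tags, so links injected by JavaScript would not be seen.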

Save and test

Save your robots.txt file and make sure it’s accessible on your web server.
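
A simple accessibility check is to request the file and confirm the server answers with 200 OK. The following is a rough sketch using the Python standard library; the helper names robots_url and is_accessible are invented here, and the example URL is a placeholder.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site_url):
    """Return the root-level robots.txt URL for the site that serves site_url."""
    parts = urlsplit(site_url)
    # Crawlers look for robots.txt at the root of the host only,
    # so drop any path, query string, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def is_accessible(site_url, timeout=10):
    """Fetch robots.txt and report whether the server answered 200 OK."""
    try:
        with urlopen(robots_url(site_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False


print(robots_url("https://www.example.com/blog/post.html?page=2"))
```

Calling is_accessible("https://www.example.com/") performs a real network request, so run it against your own site rather than the placeholder domain.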

To check that it is correctly configured, you can use the robots.txt report in Google Search Console (which replaced the older “robots.txt Tester” tool) or similar tools provided by other search engines.

Check search engine crawling

After specifying your sitemap location in robots.txt, monitor how search engines crawl your website, using both your server’s access logs and the search engines’ webmaster tools.
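
For the log side of that monitoring, a short script can count how often each crawler requests robots.txt and the sitemap. This sketch assumes your server writes access logs in the common “combined” format and that the sitemap lives at /sitemap.xml; the sample log lines and the ExampleBot user agent are invented for illustration.

```python
import re
from collections import Counter

# Matches the request path and the user-agent field of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"')

WATCHED_PATHS = {"/robots.txt", "/sitemap.xml"}


def count_fetches(log_lines):
    """Count requests for robots.txt and the sitemap, keyed by (path, user agent)."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match.group("path") in WATCHED_PATHS:
            counts[(match.group("path"), match.group("agent"))] += 1
    return counts


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Jan/2024:10:00:02 +0000] "GET /sitemap.xml HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [01/Jan/2024:10:05:00 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

fetches = count_fetches(SAMPLE_LOG)
for (path, agent), hits in sorted(fetches.items()):
    print(path, agent, hits)
```

A crawler that fetches robots.txt but never requests the sitemap afterwards is a hint that the Sitemap line is not being picked up. User-agent strings can be spoofed, so treat the counts as indicative only.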

Keep your robots.txt updated

If you move or rename your XML sitemap(s) or need to add more sitemaps, remember to update your robots.txt accordingly. The file can contain more than one Sitemap line, one per sitemap URL.
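
To catch a robots.txt that has fallen out of step with your sitemaps, you could compare the Sitemap lines it contains against the list you expect. This is a minimal sketch, assuming Python 3.8 or later; the sitemap_drift helper and the sitemap-news.xml URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def sitemap_drift(robots_txt, expected_sitemaps):
    """Compare the Sitemap lines in robots_txt with the sitemaps you publish.

    Returns (missing, stale): URLs that should be listed but are not,
    and URLs that are listed but no longer expected.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    listed = set(parser.site_maps() or [])  # site_maps() returns None if empty
    expected = set(expected_sitemaps)
    return sorted(expected - listed), sorted(listed - expected)


# Hypothetical state: a second sitemap was added but robots.txt was not updated.
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://www.example.com/sitemap.xml
"""
EXPECTED = [
    "https://www.example.com/sitemap.xml",
    "https://www.example.com/sitemap-news.xml",
]

missing, stale = sitemap_drift(ROBOTS_TXT, EXPECTED)
print("missing:", missing)
print("stale:", stale)
```

Here the second sitemap is reported as missing, which is the cue to add another Sitemap line to robots.txt.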

By following these steps, you can effectively inform search engines about the location of your XML sitemaps, enabling them to discover and index your website’s content more efficiently.