Robots.txt sitemap entries
The robots.txt file is a plain-text file that tells search engine crawlers which parts of your website they may access. One of its key features is the ability to specify the location of your XML sitemap(s). Here’s a step-by-step guide on how to do it:
Create or Access Your robots.txt File:
If you don’t already have a robots.txt file, you can create one in the root directory of your website. If it already exists, make sure you have access to edit it.
Specify the Sitemap Location:
To specify the location of your XML sitemap, add a Sitemap line containing its full URL. Your robots.txt file should look something like this:
User-agent: Googlebot
Disallow: /nogooglebot/
User-agent: *
Allow: /
Sitemap: https://www.example.com/sitemap.xml
You may also want to add an Allow rule as a hint about your HTML sitemap. This gives spiders that follow web links, rather than XML sitemaps, a navigable page to start from.
User-agent: Googlebot
Disallow: /nogooglebot/
User-agent: *
Allow: /
Allow: /sitemap.htm
Sitemap: https://www.example.com/sitemap.xml
You should also link to your HTML sitemap from one of your active pages to ensure it is navigable and not orphaned. The link can be subtle and discreet, but spiders will discover it more easily if it sits high up in your website, for example on your homepage.
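The rules shown in this step can be checked locally before uploading. The following is a minimal sketch using Python's standard `urllib.robotparser` module (its `site_maps()` method needs Python 3.8 or later). The `inspect_robots` helper and the test URLs are illustrative assumptions, and Python's parser applies the first matching rule, which may differ from how a particular search engine resolves conflicting rules.

```python
from urllib.robotparser import RobotFileParser

# The example file from this page, including the HTML sitemap hint.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /nogooglebot/
User-agent: *
Allow: /
Allow: /sitemap.htm
Sitemap: https://www.example.com/sitemap.xml
"""


def inspect_robots(text):
    """Parse robots.txt content and report what it declares (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return {
        # site_maps() returns None when no Sitemap line is present.
        "sitemaps": parser.site_maps() or [],
        "googlebot_blocked": not parser.can_fetch(
            "Googlebot", "https://www.example.com/nogooglebot/page.htm"
        ),
        "html_sitemap_allowed": parser.can_fetch(
            "*", "https://www.example.com/sitemap.htm"
        ),
    }


if __name__ == "__main__":
    for key, value in inspect_robots(ROBOTS_TXT).items():
        print(f"{key}: {value}")
```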
Save and test
Save your robots.txt file and make sure it’s accessible on your web server.
To confirm it is correctly configured, check the robots.txt report in Google Search Console (the successor to the older “robots.txt Tester” tool) or use similar tools provided by other search engines.
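Before relying on an online checker, a quick local sanity check can catch the most common mistake, a Sitemap line that is missing or not a full URL. This is a rough sketch only: `check_sitemap_lines` is a hypothetical helper, it treats `#` as the start of a comment, and it does not replace the validation a search engine performs.

```python
from urllib.parse import urlparse


def check_sitemap_lines(robots_text):
    """Return (sitemap_urls, problems) for the given robots.txt content."""
    urls, problems = [], []
    for number, raw in enumerate(robots_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != "sitemap":
            continue  # not a Sitemap line
        value = value.strip()
        parsed = urlparse(value)
        # A sitemap location should be an absolute http(s) URL.
        if parsed.scheme in ("http", "https") and parsed.netloc:
            urls.append(value)
        else:
            problems.append(f"line {number}: sitemap URL is not absolute: {value!r}")
    if not urls and not problems:
        problems.append("no Sitemap line found")
    return urls, problems


if __name__ == "__main__":
    text = "User-agent: *\nAllow: /\nSitemap: https://www.example.com/sitemap.xml\n"
    print(check_sitemap_lines(text))
```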
Check search engine crawling
After specifying your sitemap location in robots.txt, monitor crawl activity on your website using your server activity logs and the search engines’ webmaster tools.
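One way to watch for crawler activity in your logs is to count requests for robots.txt and the sitemap file. The sketch below assumes the Apache/Nginx "combined" access log format and the file paths used in the examples on this page. The `sitemap_hits` helper, the regular expression, and the sample log lines are illustrative and will need adjusting for other log formats.

```python
import re
from collections import Counter

# Assumed "combined" log format: request line, status, size, referrer, user agent.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def sitemap_hits(log_lines, paths=("/robots.txt", "/sitemap.xml")):
    """Count requests for the given paths, keyed by (user agent, path, status)."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match.group("path") in paths:
            key = (match.group("agent"), match.group("path"), match.group("status"))
            hits[key] += 1
    return hits


if __name__ == "__main__":
    # Made-up sample lines; "ExampleBot" is not a real crawler.
    sample = [
        '192.0.2.1 - - [10/Oct/2024:13:55:36 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "ExampleBot/1.0"',
        '192.0.2.1 - - [10/Oct/2024:13:55:37 +0000] "GET /sitemap.xml HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    ]
    for key, count in sitemap_hits(sample).items():
        print(key, count)
```

A status other than 200 for either file suggests the crawler could not read it.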
Keep your robots.txt updated
If you make changes to your XML sitemap(s) or need to add additional sitemaps, remember to update your robots.txt accordingly.
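If the list of sitemaps changes often, the Sitemap lines can be regenerated rather than edited by hand. The following is a small sketch, not a feature of the generator: `set_sitemaps` is a hypothetical helper that replaces every existing Sitemap line with the URLs you pass in, and `sitemap-images.xml` is an invented file name used only for illustration.

```python
def set_sitemaps(robots_text, sitemap_urls):
    """Return robots.txt content whose Sitemap lines list exactly sitemap_urls."""
    # Keep every line that is not an existing Sitemap directive.
    kept = [
        line
        for line in robots_text.splitlines()
        if not line.strip().lower().startswith("sitemap:")
    ]
    # Drop trailing blank lines so the new entries follow the rules directly.
    while kept and not kept[-1].strip():
        kept.pop()
    kept.extend(f"Sitemap: {url}" for url in sitemap_urls)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    old = "User-agent: *\nAllow: /\nSitemap: https://www.example.com/sitemap.xml\n"
    print(
        set_sitemaps(
            old,
            [
                "https://www.example.com/sitemap.xml",
                "https://www.example.com/sitemap-images.xml",  # hypothetical second sitemap
            ],
        )
    )
```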
By following these steps, you can effectively inform search engines about the location of your XML sitemaps, enabling them to discover and index your website’s content more efficiently.