What is a robots.txt file?
Robots.txt is a text file webmasters create to instruct web robots (typically search engine robots) how to crawl pages on their website.
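For example, a minimal robots.txt, placed at the root of the domain (the path `/admin/` here is just an illustration), might look like this:

```
User-agent: *
Disallow: /admin/
```

`User-agent` names the crawler the rules apply to (`*` matches all crawlers), and each `Disallow` line lists a path that crawler should not fetch.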
How does a robots.txt file help SEO?
When set up correctly, a robots.txt file can stop crawl bots from accessing pages you don't want indexed. You can also use it to point crawlers to the location of your sitemap.
Why do you need a robots.txt file?
Robots.txt files control crawler access to certain areas of your site. While this can be dangerous if you accidentally disallow Googlebot from crawling your entire site, there are some situations in which a robots.txt file can be very handy.
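You can check how a given robots.txt controls access programmatically. This sketch uses Python's standard-library `urllib.robotparser`; the rules and URLs are hypothetical examples, not a real site's file:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration
rules = """User-agent: *
Disallow: /staging/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Ask whether a crawler matching "*" may fetch each URL
print(parser.can_fetch("*", "https://example.com/staging/index.html"))  # False
print(parser.can_fetch("*", "https://example.com/blog/post"))           # True
```

Googlebot and other well-behaved crawlers perform essentially this check before fetching a page, which is also why robots.txt is advisory: a crawler that ignores the file can still request the URLs.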
Some common use cases include:
- Preventing duplicate content from appearing in SERPs (note that meta robots is often a better choice for this)
- Keeping entire sections of a website private (for instance, your engineering team’s staging site)
- Keeping internal search results pages from showing up on a public SERP
- Specifying the location of sitemap(s)
- Preventing search engines from indexing certain files on your website (images, PDFs, etc.)
- Specifying a crawl delay in order to prevent your servers from being overloaded when crawlers load multiple pieces of content at once
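The use cases above can be sketched in a single robots.txt; the paths and sitemap URL here are hypothetical:

```
User-agent: *
Disallow: /staging/   # keep the staging site private
Disallow: /search     # block internal search results pages
Disallow: /*.pdf$     # block PDF files (wildcards are supported by Google and Bing, but not all crawlers)
Crawl-delay: 10       # seconds between requests (note: Googlebot ignores this directive)

Sitemap: https://example.com/sitemap.xml
```

Lines starting with `#` are comments. Note that `Disallow` only prevents crawling; a disallowed URL can still appear in search results if other sites link to it, which is why a `noindex` meta robots tag is often the better tool for keeping pages out of the index.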
If there are no areas on your site to which you want to control user-agent access, you may not need a robots.txt file at all.