What Is a robots.txt File and How Do You Optimise It?
Every website has a robots.txt file — or at least should have one. It’s a simple text file that lives at the root of your domain and tells search engine crawlers which parts of your site they’re allowed to access. Get it right and you guide Google to your most important content; get it wrong and you risk accidentally blocking pages from appearing in search results entirely.
Despite being one of the most fundamental SEO files, robots.txt is also one of the most misunderstood. This guide explains what it is, how to read and write the rules it contains, and what you should and shouldn’t use it for.
What robots.txt does and how crawlers use it
When a search engine crawler like Googlebot arrives at your website, it first checks for a file at yourwebsite.com/robots.txt. This file contains a set of instructions that tell the crawler which URLs it may and may not visit. Reputable crawlers read these instructions and follow them before requesting anything else on your site, but compliance is voluntary: robots.txt is a convention, not an access control.
A robots.txt file is made up of groups. Each group starts with a “User-agent” line (naming the crawler the rules apply to, such as Googlebot, or all crawlers using the asterisk wildcard) followed by “Disallow” or “Allow” lines (specifying which paths to block or permit). For example, “Disallow: /admin/” placed under “User-agent: *” tells all crawlers to avoid any URL whose path begins with /admin/. A crawler that has its own named group follows that group instead of the wildcard one.
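To see how user-agent groups behave in practice, here is a minimal Python sketch using the standard library's urllib.robotparser. The file contents, the crawler name SomeBot, the /drafts/ path and the helper build_parser are illustrative assumptions, not taken from any real site. Python's parser only approximates Google's matching (it does not support wildcards inside paths, for example), so treat the result as a rough check rather than a definitive answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: Googlebot
Disallow: /drafts/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
site = "https://www.yourwebsite.com"

# "SomeBot" has no group of its own, so the wildcard group applies.
print(parser.can_fetch("SomeBot", site + "/admin/login"))    # blocked
print(parser.can_fetch("SomeBot", site + "/blog/"))          # allowed

# Googlebot has its own group, so only the /drafts/ rule applies to it.
print(parser.can_fetch("Googlebot", site + "/drafts/post"))  # blocked
print(parser.can_fetch("Googlebot", site + "/admin/login"))  # allowed
```

The last line shows a detail that often catches people out: once a crawler has its own group, the rules under the wildcard group no longer apply to it, so any shared rules need to be repeated in both groups.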
Critically, robots.txt controls crawling, not indexing. Blocking a page in robots.txt prevents Google from visiting it, but if other pages link to it, Google may still index it as a bare URL with no description or content. To keep a page out of the index entirely, use a noindex robots meta tag on the page itself (or an X-Robots-Tag HTTP header), and leave the page crawlable: Google can only obey a noindex instruction it is able to read.
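As a rough illustration of the indexing side, the sketch below checks whether a page's HTML carries a noindex robots meta tag, using only Python's standard html.parser module. The class name RobotsMetaParser, the function has_noindex and the sample HTML are hypothetical, and the check ignores the X-Robots-Tag HTTP header, so it is a starting point rather than a complete audit.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a noindex robots meta directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source, for illustration only.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```

A tag like this only works if the page is not disallowed in robots.txt, because a crawler that never fetches the page never sees the tag.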
Common things to block and what to leave open
Most websites benefit from blocking certain backend or utility paths from crawlers. Admin panels (/admin/, /wp-admin/), internal search results pages, staging subdirectories, and cart/checkout pages on e-commerce sites are all reasonable candidates. These pages offer no value to search engines, and crawling them uses up crawl budget (the amount of crawling Google is prepared to do on your site) that would be better spent on your real content.
What you should never block are your important content pages — product pages, service pages, blog posts, and the homepage. A surprisingly common mistake is accidentally blocking CSS and JavaScript files. If Google can’t access your site’s styling and scripts, it can’t properly render your pages and assess their content, which can harm your rankings.
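One way to guard against accidental blocks is a small check that runs your must-stay-open URLs against the rules. The sketch below uses Python's standard urllib.robotparser; the rules, the example paths and the helper blocked_paths are all hypothetical, and Python's matching is simpler than Google's (no wildcard support), so it will not catch every case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules blocking utility paths only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /search/
Disallow: /cart/
Disallow: /checkout/
"""

# Hypothetical paths that crawlers must always be able to reach.
MUST_STAY_OPEN = [
    "/",
    "/products/blue-widget",
    "/blog/what-is-robots-txt",
    "/assets/css/site.css",
    "/assets/js/app.js",
]


def blocked_paths(robots_txt, paths, user_agent="Googlebot"):
    """Return the subset of paths that the given rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]


# An empty list means none of the important paths are blocked.
print(blocked_paths(ROBOTS_TXT, MUST_STAY_OPEN))

# A careless extra rule that would block stylesheets and scripts.
risky_rules = ROBOTS_TXT + "Disallow: /assets/\n"
print(blocked_paths(risky_rules, MUST_STAY_OPEN))
```

Running a check like this whenever the file changes makes it harder for a stray rule to hide your CSS, JavaScript or key pages from crawlers.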
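URL parameter variations that create duplicate content, like ?sort=price or ?utm_source=newsletter, are sometimes blocked in robots.txt, but a canonical tag is usually the better solution. It tells Google which version of the page to index, whereas a robots.txt block stops Google from crawling the duplicate at all, so it never sees the canonical tag. Google retired Search Console’s URL Parameters tool in 2022, so that is no longer an option.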
How to check and optimise your robots.txt file
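You can view any site’s robots.txt file by navigating to yourwebsite.com/robots.txt in a browser. For your own site, Google Search Console used to offer a robots.txt tester under Legacy Tools, but Google retired it in late 2023. Its replacement is the robots.txt report (under Settings), which shows the robots.txt files Google found for your site, when each was last crawled, and any warnings or errors. To check whether a specific URL is allowed or blocked, use the URL Inspection tool.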
When optimising your robots.txt, less is often more. Many websites accumulate disallow rules over time that are outdated or unnecessary. Review the file regularly and remove any rules that don’t serve a clear purpose. Overly complex robots.txt files are harder to maintain and easier to break.
One useful addition to your robots.txt file is a Sitemap directive: a line that tells crawlers the location of your XML sitemap. This helps Google find and process your sitemap even if you haven’t submitted it manually via Search Console. Add a line such as “Sitemap: https://www.yourwebsite.com/sitemap.xml”, using the full absolute URL. The line does not belong to any user-agent group, so it can go anywhere in the file; the bottom is a common convention.
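To confirm that a Sitemap line is being picked up, you can read it back with Python's standard urllib.robotparser (the site_maps method needs Python 3.8 or later). The file contents and the helper find_sitemaps below are hypothetical, shown only as a sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://www.yourwebsite.com/sitemap.xml
"""


def find_sitemaps(robots_txt: str) -> list:
    """Return the sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(find_sitemaps(ROBOTS_TXT))
```

If the list comes back empty for your own file, check the spelling of the directive and that the URL is absolute.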