Check whether a site has a valid robots.txt before launch. We parse crawl rules, Disallow and Allow paths, Sitemap directives, and Crawl-delay values so you can catch configuration mistakes before they block your pages from search engines.
Reads every User-agent block and lists disallowed and allowed paths.
Checks for Sitemap: directives so crawlers can find your pages.
Returns practical recommendations for what to fix before going live.
Part of the checklist
This checker is part of our startup launch SEO checklist. After robots.txt, the natural next checks are sitemap validation, metadata, Open Graph previews, and favicon setup.
Read the full launch checklist
A robots.txt file lives at the root of your domain (e.g., yourdomain.com/robots.txt) and tells search engine crawlers which pages they may or may not visit. It is the first file most crawlers fetch when they arrive at a new site.
A misconfigured robots.txt can silently block your entire site from Google. The most common mistake is a stale Disallow: / rule left over from staging that never got removed before launch. Referencing your sitemap here also helps crawlers discover all of your pages faster.
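To make the staging mistake concrete, here is a hypothetical robots.txt that is correct for a staging environment but disastrous if shipped to production unchanged:

```
# Fine on staging, a launch-blocker in production:
User-agent: *
Disallow: /
```

A production file would typically allow crawling and point at the sitemap instead, e.g. `Disallow:` (empty) or targeted rules like `Disallow: /admin/`.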
What is a robots.txt file?
A robots.txt file is a plain text file placed at the root of a website (e.g., example.com/robots.txt) that tells search engine crawlers which pages or sections of the site they are allowed or not allowed to visit. It is part of the Robots Exclusion Protocol.
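As a sketch of how these rules behave, Python's standard-library urllib.robotparser can evaluate a robots.txt against specific URLs (the domain and paths below are illustrative, not from any real site):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, supplied as a list of lines
rules = [
    "User-agent: *",
    "Disallow: /admin/",
    "Allow: /",
]

rp = RobotFileParser()
rp.parse(rules)

# Ask whether a generic crawler ("*") may fetch each URL
blocked = rp.can_fetch("*", "https://example.com/admin/settings")  # False
allowed = rp.can_fetch("*", "https://example.com/pricing")         # True
```

The same parser is what many checkers build on: fetch the file, parse it, then probe the URLs you care about.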
Why does robots.txt matter for SEO?
robots.txt controls what search engines crawl. If important pages are accidentally blocked, they won't appear in search results. It also lets you point crawlers to your sitemap so they can discover all your pages faster.
What does Disallow: / mean in robots.txt?
Disallow: / tells crawlers they are not allowed to access any page on the site. This effectively blocks your entire site from being indexed by search engines. It is almost always a mistake if found on a live production site.
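You can verify this behavior directly with the same standard-library parser (again using an illustrative domain):

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

# With Disallow: /, every path is blocked, including the homepage
home_ok = rp.can_fetch("*", "https://example.com/")         # False
page_ok = rp.can_fetch("*", "https://example.com/pricing")  # False
```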
Should I add my sitemap to robots.txt?
Yes. Adding a Sitemap: directive to robots.txt is a best practice. It allows all crawlers — not just Googlebot — to discover your sitemap automatically without needing to submit it manually to every search engine.
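Since Python 3.8, urllib.robotparser also exposes any Sitemap: directives it finds, which is one way a checker can confirm the directive is present (example values are hypothetical):

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Allow: /",
    "Sitemap: https://example.com/sitemap.xml",
])

# site_maps() returns the declared sitemap URLs, or None if there are none
sitemaps = rp.site_maps()
```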
Does robots.txt prevent pages from being indexed?
robots.txt prevents crawlers from visiting those pages, but it does not guarantee they won't appear in search results. If other sites link to a disallowed page, search engines may still list it without content. To fully prevent indexing, use a noindex meta tag instead.
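For a page that must stay out of search results, the noindex directive goes in the page's HTML head, and the page must remain crawlable so search engines can actually see the tag:

```html
<meta name="robots" content="noindex">
```

Blocking the same page in robots.txt would be counterproductive here: crawlers that cannot fetch the page never see the noindex instruction.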