Why isn't your site showing in Google? Start with robots.txt and sitemap.xml
You paste a URL, pick a mode (`robots.txt` alone, `sitemap.xml` alone, or Both together) and hit Check. Our server fetches the publicly accessible files, parses them and shows you exactly what Googlebot would see when it visits your domain.
The validator does three things you can't do from the browser:
- Pulls `robots.txt` from the actual origin, not your CDN cache: the same bytes a crawler would get;
- Simulates real bots: Googlebot, Bingbot, GPTBot, ChatGPT-User, ClaudeBot. Pick a bot from the chips and you see exactly the rules that apply to it (with longest-prefix matching, the algorithm Google really uses);
- Parses sitemap.xml (including a sitemap index with nested sitemaps), checks the spec limits (50,000 URLs, 50 MB), validates W3C/ISO-8601 dates, `changefreq`, `priority` and surfaces duplicate `<loc>` entries.
Everything comes back as a tidy report with errors (red), warnings (yellow) and info (gray). There is also a URL tester: paste `/admin` or `/private/reports.pdf` and instantly see "allowed" or "disallowed" for the selected bot.
Why bother? One of the most common reasons a new site never gets indexed is a small mistake in robots.txt: a leftover `Disallow: /` where `Disallow: /admin` was intended, a misspelled directive such as `Disllow` that crawlers silently ignore, or no sitemap link in robots.txt. The validator catches these in 5 seconds.
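To make the longest-prefix rule concrete, here is a minimal Python sketch of how an allowed/disallowed verdict can be reached for one bot's rules. It is an illustration, not the validator's own code: the function name `is_allowed` and the `(kind, prefix)` rule format are our assumptions, and it deliberately ignores the `*` and `$` wildcards that real crawlers also support.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, str]  # ("allow" | "disallow", path prefix) -- hypothetical format


def is_allowed(path: str, rules: List[Rule]) -> Tuple[bool, Optional[Rule]]:
    """Return (allowed, deciding_rule) using longest-prefix matching.

    Simplification: plain prefix comparison only, no `*` or `$` wildcards.
    """
    best: Optional[Rule] = None
    for kind, prefix in rules:
        if not prefix:
            # An empty "Disallow:" restricts nothing, so it never matches.
            continue
        if not path.startswith(prefix):
            continue
        longer = best is None or len(prefix) > len(best[1])
        # On equal length, the less restrictive rule (allow) wins.
        tie_allow = best is not None and len(prefix) == len(best[1]) and kind == "allow"
        if longer or tie_allow:
            best = (kind, prefix)
    if best is None:
        return True, None  # no rule matched: crawling is allowed by default
    return best[0] == "allow", best


rules = [
    ("disallow", "/admin"),
    ("allow", "/admin/public"),
    ("disallow", "/private/"),
]
```

With these rules, `/admin` is disallowed by `Disallow: /admin`, while `/admin/public/help` is allowed because the longer `Allow: /admin/public` prefix wins. A path that matches no rule at all, such as `/blog`, is allowed.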
How to use it
- Pick a mode in the segmented bar at the top. If unsure, choose "Both": we'll fetch `/robots.txt` first, find the sitemap link inside, and pull that too.
- Paste your URL into the URL field. Bare domain (`example.com`), full URL (`https://example.com`) or a direct link to a sitemap (`https://example.com/sitemap.xml`) all work.
- Hit "Check" (or press Enter). The server fetches with a 10-second timeout and a 50 MB cap, so even huge sitemaps won't stall the validation.
- The robots.txt section shows: HTTP status, file size, group count, total Allow/Disallow rules. Issues are split into 3 severity levels (error / warning / info), each with the line number where it lives.
- Per-bot view: click the bot chips (Googlebot, Bingbot, GPTBot, ChatGPT-User and others). You see exactly the rules that apply to that bot, and we tell you which User-Agent token in your file matched.
- URL tester: type any path (e.g. `/admin` or `/api/users`) and see "Allowed" or "Disallowed" plus the exact rule that decided. Perfect for figuring out why a specific URL is missing from Google.
- The sitemap section shows: type (urlset / sitemapindex), URL count, `lastmod` coverage (%), newest and oldest date, plus a sample of the first 100 URLs in a table. If it's a sitemap index, we automatically fetch the nested sitemaps (up to 50 for safety).
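The sitemap numbers in that report are easy to reproduce by hand. The sketch below, using only Python's standard library, shows one way to detect the sitemap type, count entries, measure `lastmod` coverage and find duplicate `<loc>` values. The function name `summarize_sitemap`, the result keys and the sample XML are our own assumptions for illustration; the validator's real output format may differ.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 50_000  # per-file URL limit from the sitemap protocol


def summarize_sitemap(xml_text: str) -> dict:
    """Summarize a urlset or sitemapindex document (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    kind = root.tag.rsplit("}", 1)[-1]  # strip "{namespace}" -> "urlset" / "sitemapindex"
    entry = "sm:url" if kind == "urlset" else "sm:sitemap"

    locs = []
    with_lastmod = 0
    for node in root.findall(entry, NS):
        locs.append(node.findtext("sm:loc", default="", namespaces=NS).strip())
        if node.findtext("sm:lastmod", default="", namespaces=NS).strip():
            with_lastmod += 1

    duplicates = sorted(loc for loc, n in Counter(locs).items() if n > 1)
    coverage = 100 * with_lastmod / len(locs) if locs else 0.0
    return {
        "type": kind,
        "url_count": len(locs),
        "lastmod_coverage_pct": coverage,
        "duplicates": duplicates,
        "over_limit": len(locs) > MAX_URLS,
    }


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/</loc><lastmod>2024-05-02</lastmod></url>
  <url><loc>https://example.com/blog</loc><lastmod>2024-04-30</lastmod></url>
</urlset>"""

summary = summarize_sitemap(SAMPLE)
```

For the sample above, `summary` reports a `urlset` with 4 entries, 75% `lastmod` coverage and one duplicate (`https://example.com/`). The sketch does not validate date formats, `changefreq` or `priority`, and it does not follow nested sitemaps.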
When this is useful
Five situations where the validator saves you a weekend in Search Console:
- New site won't index in Google. You check `robots.txt`, and the validator flags `Disallow: /` under `User-agent: *` (the classic dev-environment leftover). You change it to `Disallow: /admin`, and Google can start crawling again once it refreshes its cached copy of `robots.txt`, generally within 24 hours.
- Domain migration or redesign. After moving to a new platform, you validate the old and the new sitemap. The validator shows 1,200 URLs missing in the new one (forgotten language prefix). You fix it in the CMS before Google notices the drop.
- SEO audit before a big launch. A client asks "why isn't the shop showing in search?" The validator finds `User-agent: Googlebot` + `Disallow: /products`: someone (knowingly or not) blocked the whole product catalogue. You'd never have spotted that without the per-bot view.
- GPTBot, ClaudeBot, Google-Extended. You want to opt out of AI training on your content. The validator's per-bot view shows whether your `Disallow: /` for `GPTBot` actually applies, or whether the bot falls back to a `*` group with `Allow: /` because no `User-agent` token in your file matches it. Group order doesn't decide this: a compliant crawler follows the most specific group that names it.
- CI/CD pre-deploy checks. Plug the validator into your pipeline (a plain `curl` with JSON does it) and fail the build when `robots.txt` has `Disallow: /` under `User-agent: *`. Pitching that to a senior DevOps engineer takes 10 minutes, and it can save thousands.
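If you'd rather see the logic behind the per-bot view and the CI check, here is a small local stand-in in Python. It is only a sketch under our own assumptions: it does not call the validator's API, the helper names `parse_groups`, `group_for`, `blocks_everything` and `check_file` are hypothetical, and bot matching is simplified to an exact, case-insensitive token comparison with a fallback to `*`.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, str]  # ("allow" | "disallow", path)


def parse_groups(text: str) -> Dict[str, List[Rule]]:
    """Map each lowercased User-agent token to the rules of its group."""
    groups: Dict[str, List[Rule]] = {}
    current: List[str] = []  # tokens of the group currently being read
    seen_rule = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a User-agent line after rules starts a new group
                current, seen_rule = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            seen_rule = True
            for agent in current:
                groups[agent].append((field, value))
    return groups


def group_for(bot: str, groups: Dict[str, List[Rule]]) -> List[Rule]:
    """Rules for a bot: its own group if present, otherwise the `*` group."""
    return groups.get(bot.lower(), groups.get("*", []))


def blocks_everything(text: str) -> bool:
    """True when the `*` group contains `Disallow: /`."""
    return ("disallow", "/") in parse_groups(text).get("*", [])


def check_file(path: str) -> int:
    """Return a process exit code: 1 if the file blocks all crawlers."""
    with open(path, encoding="utf-8") as fh:
        return 1 if blocks_everything(fh.read()) else 0
```

In a pipeline step you could call `sys.exit(check_file("public/robots.txt"))` (the path is an example) so the build fails before a blanket `Disallow: /` reaches production. Note that `group_for` picks the bot's own group regardless of where the `*` group appears in the file.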
Need to author the files? Generate them in the robots.txt builder and the sitemap.xml builder. For social previews of the same URLs, use the OpenGraph preview.