Simulate how Googlebot and other crawlers apply Allow/Disallow rules from your robots.txt to specific URLs.
| URL | User-agent | Result | Matched rule (robots.txt line) |
|---|---|---|---|
| https://example.com/ | Googlebot | Allowed | L10: allow: / |
| https://example.com/admin/ | Googlebot | Allowed | L10: allow: / |
| https://example.com/admin/public/login | Googlebot | Allowed | L10: allow: / |
| https://example.com/private/file.pdf | Googlebot | Allowed | L10: allow: / |
| https://example.com/no-google/page | Googlebot | Blocked | L9: disallow: /no-google/ |
| https://example.com/blog/post-1 | Googlebot | Allowed | L10: allow: / |
| https://example.com/ | GPTBot | Blocked | L13: disallow: / |
| https://example.com/admin/ | GPTBot | Blocked | L13: disallow: / |
| https://example.com/admin/public/login | GPTBot | Blocked | L13: disallow: / |
| https://example.com/private/file.pdf | GPTBot | Blocked | L13: disallow: / |
| https://example.com/no-google/page | GPTBot | Blocked | L13: disallow: / |
| https://example.com/blog/post-1 | GPTBot | Blocked | L13: disallow: / |
robots.txt is the file served at /robots.txt on a site that tells crawlers (Googlebot, Bingbot, GPTBot, ClaudeBot, and many others) which paths they may or may not request. It uses two main directives, Allow and Disallow, plus User-agent groups that scope rules to specific bots. The matching rules feel intuitive at first but get tricky fast: for User-agent precedence, the most specific matching group wins and User-agent: * is only a fallback; for path precedence, the longest matching pattern wins and ties go to Allow; and the wildcards * and $ have subtle meanings. A single misplaced trailing slash can leave your /admin/ directory open to every crawler: Disallow: /admin/ does not cover the URL /admin, while Disallow: /admin also blocks /administrator. An over-eager Disallow can block crawlers from your entire site. This Robots.txt Tester parses any robots.txt body you paste, lets you choose one or more user-agents (Googlebot, GPTBot, ClaudeBot, etc.) and a list of URLs, and shows exactly which rule on which line matches each URL, following Google's public matching specification.
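To make the trailing-slash and wildcard behavior concrete, here is a minimal Python sketch of pattern-to-path matching. The helper names `pattern_to_regex` and `path_matches` are hypothetical and are not the tester's actual code. It assumes patterns match as prefixes of the URL path, `*` matches any run of characters, and a trailing `$` anchors the end. It ignores percent-encoding and the empty-pattern case.

```python
import re

def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex (hypothetical helper)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then rejoin the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    # r"\Z" matches only at the very end of the string.
    return re.compile(body + (r"\Z" if anchored else ""))

def path_matches(pattern: str, path: str) -> bool:
    # re.match anchors at the start, which gives prefix semantics.
    return pattern_to_regex(pattern).match(path) is not None

# Trailing slash: '/admin/' covers the directory contents but not '/admin' itself.
print(path_matches("/admin/", "/admin/settings"))   # True
print(path_matches("/admin/", "/admin"))            # False
# Without the slash, the prefix also catches unrelated paths.
print(path_matches("/admin", "/administrator"))     # True
# '$' anchors the end, so a query string defeats the match.
print(path_matches("/*.pdf$", "/private/file.pdf"))      # True
print(path_matches("/*.pdf$", "/private/file.pdf?x=1"))  # False
```

This sketch only answers whether one pattern matches one path. Choosing between several matching rules is a separate step.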
Per Google's spec: (1) User-agent matching: the User-agent group whose name is the longest case-insensitive prefix match of the crawler name wins; User-agent: * applies only if no specific group matches. (2) Path matching: among all Allow/Disallow rules in the chosen group that match the URL path, the one with the longest pattern (excluding wildcards) wins. If two matching patterns have equal specificity, Allow beats Disallow. If no rule matches, the URL is allowed. (3) Wildcards: * matches any sequence of characters (including none), and $ at the end of a pattern anchors the match to the end of the URL. Crawl-delay is non-standard and ignored by Google, though some other crawlers, such as Bingbot, honor it.
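The three rules above can be sketched in Python as follows. This is an illustrative approximation, not the tester's implementation: the names `select_group`, `matches`, `specificity`, and `decide` are hypothetical, the `groups` dictionary is made-up sample data, and measuring specificity as pattern length without `*` and the trailing `$` is one reading of "excluding wildcards". Parsing of the robots.txt file itself is left out.

```python
import re
from typing import Dict, List, Optional, Tuple

Rule = Tuple[str, str]  # (directive, pattern); directive is "allow" or "disallow"

def select_group(groups: Dict[str, List[Rule]], crawler: str) -> List[Rule]:
    """Rule 1: longest case-insensitive prefix match on the crawler name, else '*'."""
    name = crawler.lower()
    hits = [g for g in groups if g != "*" and name.startswith(g.lower())]
    if hits:
        return groups[max(hits, key=len)]
    return groups.get("*", [])

def matches(pattern: str, path: str) -> bool:
    """Rule 3: '*' matches any run of characters, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(piece) for piece in core.split("*"))
    return re.match(regex + (r"\Z" if anchored else ""), path) is not None

def specificity(pattern: str) -> int:
    # Assumption: count only literal characters, skipping '*' and a trailing '$'.
    return len(pattern.replace("*", "").rstrip("$"))

def decide(rules: List[Rule], path: str) -> Tuple[bool, Optional[Rule]]:
    """Rule 2: the longest matching pattern wins; on a tie, Allow beats Disallow."""
    best: Optional[Rule] = None
    for rule in rules:
        directive, pattern = rule
        if not pattern or not matches(pattern, path):
            continue  # an empty pattern matches nothing
        if best is None:
            best = rule
            continue
        new, old = specificity(pattern), specificity(best[1])
        if new > old or (new == old and directive == "allow"):
            best = rule
    if best is None:
        return True, None  # no rule matched, so the path is allowed
    return best[0] == "allow", best

# Hypothetical sample data, keyed by User-agent group name.
groups: Dict[str, List[Rule]] = {
    "*": [("disallow", "/admin/"), ("allow", "/admin/public/"), ("disallow", "/private/")],
    "googlebot": [("disallow", "/no-google/"), ("allow", "/")],
    "gptbot": [("disallow", "/")],
}

print(decide(select_group(groups, "Googlebot"), "/no-google/page"))
# (False, ('disallow', '/no-google/'))
print(decide(select_group(groups, "Bingbot"), "/admin/public/login"))
# (True, ('allow', '/admin/public/'))
```

In this sample, Googlebot never sees the `/admin/` rule because its own group replaces the `*` group entirely rather than merging with it, which is why a crawler-specific group usually needs to repeat any shared restrictions.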