Free Tool · Scale Visibility

AI Crawler robots.txt Analyzer

Paste your robots.txt to see which AI crawlers you're allowing or blocking — across OpenAI, Anthropic, Google, Perplexity, Apple, Meta, ByteDance, Amazon, and more. Most sites silently allow every AI crawler without realizing the trade-offs. This tool tells you exactly which bot does what so you can decide on purpose.

Paste your robots.txt below. For each AI crawler, the tool reports whether you allow it, block it, or let it pass through silently because no rule names it, and explains the trade-off. Most site owners haven't audited this, and blocking the wrong bot quietly removes your business from AI answers.

Generate your corrected robots.txt

Pick the AI-crawler policy you want. Paste your current robots.txt above and we'll merge the policy into it; with nothing pasted you'll get a clean starter file.

Starter robots.txt
# robots.txt — AI-crawler policy applied with Scale Visibility's free tool
# https://www.scalevisibility.com/tools/robots-txt-analyzer/
# Existing non-AI rules are preserved as-is; the AI-crawler rules below
# reflect your chosen policy. Place this file at https://yourdomain.com/robots.txt

# ── AI crawlers: allow all (training + search/citation) ──
# An explicit Allow group keeps these bots permitted (for crawlers that
# honor robots.txt) even if your User-agent: * rule below is restrictive.

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Amazonbot
Allow: /

User-agent: YouBot
Allow: /

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

User-agent: FacebookBot
Allow: /

User-agent: Bytespider
Allow: /

User-agent: CCBot
Allow: /

User-agent: cohere-ai
Allow: /

User-agent: omgili
Allow: /

User-agent: GoogleOther
Allow: /

User-agent: Diffbot
Allow: /

# Default for all other (non-AI) crawlers
User-agent: *
Allow: /
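
The comment at the top of the starter file says a bot with its own group is not bound by a restrictive User-agent: * rule. Here is a minimal sketch of that behavior using Python's standard urllib.robotparser. The sample file and the bot name SomeOtherBot are made up for illustration, and real crawlers may apply the rules with their own parsers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: one named group plus a restrictive default group.
RESTRICTIVE = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RESTRICTIVE.splitlines())

# The named bot matches its own group, so the default group is ignored for it.
named_ok = parser.can_fetch("OAI-SearchBot", "https://example.com/")
# "SomeOtherBot" is an invented name with no group, so it falls back to "*".
other_ok = parser.can_fetch("SomeOtherBot", "https://example.com/")

print(named_ok, other_ok)
```
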

Why this matters for AI search

The most common misconception: people block GPTBot thinking it stops ChatGPT from "using" their content, but GPTBot only collects training data. ChatGPT citations come from OAI-SearchBot and ChatGPT-User. The same trap applies to ClaudeBot (training) versus Claude-User and Claude-SearchBot (citations). If you want AI visibility, the search/citation bots are the ones to allow, regardless of how you feel about training.
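
To make the split concrete, here is a sketch of a search-only policy that blocks the training crawlers and allows the search/citation crawlers, then checks the result with Python's standard urllib.robotparser. The policy text, the helper name allowed_bots, and the example.com URL are illustrative assumptions, not the file this tool generates.

```python
from urllib.robotparser import RobotFileParser

# Illustrative search-only policy: training bots blocked, citation bots allowed.
SEARCH_ONLY = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: *
Allow: /
"""

def allowed_bots(robots_txt, bots, url="https://example.com/"):
    """Map each bot name to True/False for whether it may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}

verdicts = allowed_bots(
    SEARCH_ONLY,
    ["GPTBot", "ClaudeBot", "OAI-SearchBot", "ChatGPT-User",
     "Claude-User", "Claude-SearchBot"],
)
for bot, ok in verdicts.items():
    print(f"{bot}: {'allowed' if ok else 'blocked'}")
```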

Beyond robots.txt — is your whole site AI-ready?

Get a free AI Search Readiness Audit

robots.txt is the access permissions layer. The audit also covers JS rendering, schema markup, llms.txt, SSR vs CSR, and how ChatGPT / Claude / Perplexity actually cite your business compared to competitors. 48-hour turnaround. No signup. No sales pitch.

Request your free audit →

Frequently asked questions

To block AI crawlers, add a group for each crawler's user-agent: for example, User-agent: GPTBot followed by Disallow: /. Paste your current robots.txt into the analyzer above and it will show which AI crawlers you allow or block today, then generate a corrected file for an allow-all, search-only, or full-block policy while keeping your existing sitemaps and rules intact.
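
If you maintain the file by hand, the per-crawler pattern is easy to generate. The sketch below builds one full-block group per user-agent. The function name block_rules is hypothetical, and the output still needs to be merged with whatever rules you already have.

```python
def block_rules(bots):
    """Return robots.txt text with one full-block group per crawler name."""
    groups = [f"User-agent: {bot}\nDisallow: /" for bot in bots]
    # A blank line separates groups; end the file with a newline.
    return "\n\n".join(groups) + "\n"

rules = block_rules(["GPTBot", "ClaudeBot"])
print(rules)
```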

The AI crawlers that matter most: GPTBot and OAI-SearchBot (OpenAI), ClaudeBot and anthropic-ai (Anthropic), PerplexityBot (Perplexity), Google-Extended (Google's AI training), Applebot-Extended (Apple), Bytespider (ByteDance), Meta-ExternalAgent (Meta), and Amazonbot. The analyzer checks for all of these and more.
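
A rough idea of what such a check involves, as a sketch only: this is not the analyzer's actual implementation. It uses Python's standard urllib.robotparser for the allow/block verdict on the site root, and a simple scan of User-agent lines to tell an explicit rule from a silent fall-through to the default group. The names AI_BOTS, named_agents, and classify are assumptions, and only the root URL is tested.

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "anthropic-ai",
           "PerplexityBot", "Google-Extended", "Applebot-Extended",
           "Bytespider", "Meta-ExternalAgent", "Amazonbot"]

def named_agents(robots_txt):
    """Collect lower-cased user-agent names that have their own group."""
    names = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent":
            names.add(value.strip().lower())
    return names

def classify(robots_txt, bots=AI_BOTS, url="https://example.com/"):
    """Return {bot: (verdict, source)} for the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    named = named_agents(robots_txt)
    report = {}
    for bot in bots:
        verdict = "allowed" if parser.can_fetch(bot, url) else "blocked"
        source = "explicit" if bot.lower() in named else "default"
        report[bot] = (verdict, source)
    return report

# Hypothetical input: GPTBot is named, every other bot falls through to "*".
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""
report = classify(sample)
for bot, (verdict, source) in report.items():
    print(f"{bot}: {verdict} ({source})")
```

Bots reported as "allowed (default)" are the ones passing through silently: nothing in the file names them, so they inherit whatever the User-agent: * group says.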

GPTBot and OAI-SearchBot are not the same crawler, and confusing them is the most common mistake. GPTBot is used mainly to gather training data. OAI-SearchBot is the crawler behind ChatGPT's live search, and ChatGPT-User fetches pages when a user asks ChatGPT to browse. If you want to stay visible in ChatGPT search while opting out of training, allow OAI-SearchBot and disallow GPTBot. Blocking the wrong one can quietly remove you from AI answers; the two-crawler trap is covered in more depth in our guide to llms.txt and robots.txt.

Whether to block AI crawlers is a trade-off. Blocking them protects your content from training use, but it also makes you invisible when buyers ask AI assistants for recommendations in your field. For most businesses that want to be found, allowing the search-oriented crawlers (and deciding on training crawlers separately) is the right call. Run the AI Visibility Check to see whether AI assistants currently name you.

robots.txt belongs at the root of your domain: yourdomain.com/robots.txt. If you don't have one, every crawler is allowed by default. Once you've set your policy, a free AI Search Readiness Audit checks robots.txt alongside the rest of your AI-visibility signals.
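
The allowed-by-default behavior can be checked with a short sketch. It models a missing file as an empty one, which is an assumption made for illustration, and uses Python's standard urllib.robotparser. The helper name can_crawl is hypothetical.

```python
from urllib.robotparser import RobotFileParser

def can_crawl(robots_txt, bot, url="https://example.com/"):
    """Return True if bot may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(bot, url)

# An empty file has no groups at all, so no rule restricts any bot.
empty_result = can_crawl("", "GPTBot")
# A single full-block default group reverses that for every unnamed bot.
blocked_result = can_crawl("User-agent: *\nDisallow: /\n", "GPTBot")

print(empty_result, blocked_result)
```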
