robots.txt Blocking GPTBot? How to Fix It

By Shlok

The short answer

If robots.txt blocks GPTBot, remove the Disallow rule for that user-agent and confirm that no CDN or WAF re-blocks it at the edge. On many sites the block comes from a Cloudflare 'block AI bots' toggle, which applies at the edge regardless of what the robots file says. After fixing both layers, re-test with the GPTBot user-agent and confirm a 200 response.

Why GPTBot access matters

GPTBot is OpenAI's crawler. If it's blocked, ChatGPT can't fetch your content, which removes you from consideration for citations in many ChatGPT answers. The block is often unintentional: a default rule or a one-click CDN setting.

Find the block

  • Check robots.txt for `User-agent: GPTBot` followed by `Disallow: /`.
  • Check your CDN/WAF for managed AI-bot blocking (e.g. Cloudflare AI Crawl Control or Bot Fight Mode).
  • Test a live fetch with the GPTBot user-agent. A 403 points to an edge block even if robots.txt looks fine.
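
The robots.txt check in the first bullet can be approximated offline with Python's standard `urllib.robotparser`. This is a minimal sketch rather than a full audit: the sample rules, the `example.com` URL, and the helper name `gptbot_allowed` are illustrative assumptions. Python's parser applies the first matching rule, which may differ from how a given crawler resolves conflicting rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /
"""


def gptbot_allowed(robots_text: str, url: str = "https://example.com/") -> bool:
    """Return True if the given robots.txt text lets GPTBot fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_text.splitlines())
    return parser.can_fetch("GPTBot", url)


print(gptbot_allowed(SAMPLE_ROBOTS))  # False: the GPTBot group disallows "/"
```

Paste in the robots.txt your domain actually serves, not the copy in your repository, since the two can differ.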

Fix it

In robots.txt, remove the Disallow for GPTBot or replace it with `Allow: /` (keeping any private paths disallowed). In your CDN/WAF, disable AI-bot blocking or whitelist GPTBot. Watch for managed robots.txt features that inject blocks above your own rules.
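
One way to express the robots.txt half of this fix is sketched below, with the result checked through Python's standard `urllib.robotparser`. The `/private/` path and the `example.com` URLs are placeholders, not paths from any real site, and `check` is a hypothetical helper. The Disallow line is listed before `Allow: /` because Python's parser applies the first matching rule; crawlers that use longest-match precedence should reach the same result with either order.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical fixed robots.txt: GPTBot may crawl everything except /private/.
FIXED_ROBOTS = """\
User-agent: GPTBot
Disallow: /private/
Allow: /

User-agent: *
Disallow: /private/
"""


def check(robots_text: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)


for path in ("/", "/blog/post", "/private/report"):
    print(path, check(FIXED_ROBOTS, "GPTBot", "https://example.com" + path))
```

If you delete the GPTBot group instead of rewriting it, GPTBot falls back to the `User-agent: *` group, so make sure that group does not disallow `/` either.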

Verify

Re-fetch with the GPTBot user-agent and confirm a 200. cited? re-checks GPTBot and every other major AI crawler automatically and shows the result per engine.
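
A manual version of this check can be sketched with Python's standard library. The user-agent string below is an assumption modeled on the format OpenAI has published for GPTBot, so confirm the current value in OpenAI's documentation; `build_request`, `fetch_status`, and `interpret` are hypothetical helper names. A request from your own machine only imitates the user-agent, not OpenAI's IP ranges, so treat the result as an approximation.

```python
import urllib.error
import urllib.request

# Assumption: modeled on OpenAI's published GPTBot user-agent; the version may change.
GPTBOT_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
    "compatible; GPTBot/1.1; +https://openai.com/gptbot"
)


def build_request(url: str, user_agent: str = GPTBOT_UA) -> urllib.request.Request:
    """Build a GET request that presents the given user-agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for `url` (redirects are followed).

    Connection failures raise urllib.error.URLError and are not caught here.
    """
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is on the exception.
        return err.code


def interpret(status: int) -> str:
    """Map a status code to a rough verdict."""
    if 200 <= status < 300:
        return "reachable"
    if status in (401, 403, 429):
        return "likely blocked or challenged"
    return "inconclusive"
```

For example, `interpret(fetch_status("https://example.com/"))` with your own URL in place of the placeholder returns `"reachable"` when the server answers with a 2xx status.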

Frequently asked questions

Should I allow GPTBot if I don't want my content used for training?

OpenAI describes GPTBot as the crawler that collects content that may be used to train its models. If your goal is AI-answer visibility, allowing it helps. If you want answer visibility but not training, review OpenAI's documented controls and the separate OAI-SearchBot agent, which OpenAI documents for ChatGPT search, and decide per your policy.

My robots.txt allows GPTBot but it's still blocked. Why?

A CDN or WAF is likely blocking it at the edge. Cloudflare and similar services can block AI bots or inject managed robots.txt rules that override yours. Disable that feature or whitelist the bot, then re-test.
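
To tell an edge block apart from a general site problem, one option is to compare the status a browser-like user-agent receives with the status the GPTBot user-agent receives for the same URL. The sketch below assumes you have already collected those two status codes and a robots.txt verdict by whatever means you prefer; `diagnose` and its messages are hypothetical, and the ordering of the checks is a judgment call rather than an established procedure.

```python
def diagnose(robots_allows: bool, browser_status: int, gptbot_status: int) -> str:
    """Guess which layer is blocking GPTBot from three observations."""
    browser_ok = 200 <= browser_status < 300
    gptbot_ok = 200 <= gptbot_status < 300
    if not browser_ok:
        # The page fails for everyone, so this is not specific to GPTBot.
        return "site problem: the URL fails for a browser-like client too"
    if not gptbot_ok:
        # Same URL, different user-agent, different result.
        return "edge block: the CDN/WAF rejects the GPTBot user-agent"
    if not robots_allows:
        return "robots.txt block: the page is reachable but GPTBot is disallowed"
    return "no block detected"


print(diagnose(robots_allows=True, browser_status=200, gptbot_status=403))
```

Base `robots_allows` on the robots.txt actually served at your domain, since a managed robots.txt feature can change what is served.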

How do I confirm the fix worked?

Send a request using the GPTBot user-agent and verify a 200 response, or run an AEO scanner that probes the live user-agent and reports access per crawler.
