
Robots.txt for AI Crawlers: The Complete 2026 Configuration Guide

⚠️ Critical Finding: In Audit 1 of getoutloop.com, the robots.txt was missing entries for all major AI crawlers. Result: GEO Technical score of 0/15. One file change fixed this entirely.

What is robots.txt and Why Does It Matter for GEO?

robots.txt is a plain text file located at yourdomain.com/robots.txt that tells web crawlers which pages they can and cannot access. Every major search engine and AI platform respects this file before indexing your content.

For GEO purposes, robots.txt is the most critical single technical file on your website. If GPTBot, ClaudeBot, or PerplexityBot are blocked — either explicitly or by a catch-all restriction — those AI platforms cannot read your content. They will never cite you. Your GEO score will be zero regardless of how well you've optimized everything else.
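The failure mode is often a single catch-all rule. A hypothetical snippet (not from the audit) that makes a site invisible to every crawler, AI or otherwise:

```txt
User-agent: *
Disallow: /
```

Because GPTBot, ClaudeBot, and PerplexityBot all fall under the * group unless they are given a group of their own, these two lines are enough to zero out AI visibility for an entire site.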

Major AI Crawlers Reference Table

| User-agent | AI Platform | Priority |
|---|---|---|
| GPTBot | OpenAI (ChatGPT) | Tier 1 |
| OAI-SearchBot | OpenAI Search | Tier 1 |
| ClaudeBot | Anthropic (Claude) | Tier 1 |
| PerplexityBot | Perplexity AI | Tier 1 |
| Google-Extended | Google (Gemini / AI Overviews) | Tier 1 |
| Bingbot | Microsoft (Copilot) | Tier 1 |
| Applebot-Extended | Apple Intelligence | Tier 2 |
| FacebookBot | Meta AI | Tier 2 |
| Amazonbot | Amazon / Alexa | Tier 2 |
| cohere-ai | Cohere | Tier 2 |
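If you maintain the crawler list in code, the per-bot groups in the template below can be generated instead of hand-edited. A small sketch (the list mirrors the table above; the function name is illustrative):

```python
# Tier 1 and Tier 2 AI crawler user-agents from the reference table.
AI_CRAWLERS = [
    "GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot",
    "Google-Extended", "Bingbot", "Applebot-Extended",
    "FacebookBot", "Amazonbot", "cohere-ai",
]

def allow_groups(agents):
    """Render one 'User-agent / Allow: /' group per crawler, blank-line separated."""
    return "\n\n".join(f"User-agent: {a}\nAllow: /" for a in agents)

print(allow_groups(AI_CRAWLERS))
```

This keeps the robots.txt in sync with whatever list you audit against, so adding a new crawler is a one-line change.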

Copy-Paste robots.txt Template

Save this as robots.txt in your website root directory. One caveat: under the Robots Exclusion Protocol (RFC 9309), a crawler obeys only the most specific group matching its user-agent, so the Disallow rules in the * group do not apply to crawlers that have their own group. If GPTBot should also skip /private/ and /admin/, repeat those Disallow lines inside its group.

robots.txt
User-agent: *
Allow: /
Disallow: /private/
Disallow: /admin/
# Crawl-delay is non-standard: Bing honors it, Google ignores it
Crawl-delay: 1

# === AI SEARCH INDEXING (ALLOW ALL — GEO Visibility Strategy) ===

# ChatGPT / OpenAI
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

# Claude / Anthropic
User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

# Perplexity AI
User-agent: PerplexityBot
Allow: /

# Google AI (Gemini, AI Overviews)
User-agent: Google-Extended
Allow: /

User-agent: GoogleOther
Allow: /

# Microsoft Bing / Copilot
User-agent: Bingbot
Allow: /

# Apple Intelligence
User-agent: Applebot
Allow: /

User-agent: Applebot-Extended
Allow: /

# Meta AI
User-agent: FacebookBot
Allow: /

# Amazon Alexa / AI
User-agent: Amazonbot
Allow: /

# Common Crawl (AI Training)
User-agent: CCBot
Allow: /

# Cohere AI
User-agent: cohere-ai
Allow: /

# === STANDARD SEARCH ENGINES ===

User-agent: Googlebot
Allow: /

User-agent: Slurp
Allow: /

User-agent: DuckDuckBot
Allow: /

# === SITEMAP ===

Sitemap: https://yourdomain.com/sitemap.xml

How to Verify Your robots.txt Works

  1. Visit https://yourdomain.com/robots.txt in your browser; you should see the plain-text file.
  2. Check Google Search Console → Settings → robots.txt report to confirm the file was fetched without errors (the standalone robots.txt tester was retired in 2023).
  3. Test with: curl -A "GPTBot" https://yourdomain.com/ ; a 200 OK confirms your server or CDN is not blocking the user-agent string itself.
  4. Run a GEO audit using the /seo-geo-audit skill; the AI Crawler Access score should jump to 12+/15.
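You can also check how your rules resolve offline with Python's standard-library urllib.robotparser. A quick sketch (the file contents below are a trimmed stand-in for your real robots.txt):

```python
from urllib.robotparser import RobotFileParser

# Trimmed stand-in for a real robots.txt: a catch-all group with a
# Disallow, plus a dedicated group for GPTBot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# GPTBot matches its own group, whose only rule is Allow: /, so even
# /private/ is allowed for it (groups are not merged across user-agents).
print(parser.can_fetch("GPTBot", "https://example.com/private/notes"))         # True

# Any other crawler falls back to the * group and is blocked there.
print(parser.can_fetch("PerplexityBot", "https://example.com/private/notes"))  # False
print(parser.can_fetch("PerplexityBot", "https://example.com/blog/post"))      # True
```

Note the group-override behavior the output demonstrates: because GPTBot has its own group, the * group's Disallow no longer applies to it, which is why any sensitive paths must be repeated in every group that should honor them.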

Want Me to Audit Your robots.txt?

The free AI Visibility Audit includes a full robots.txt review + all GEO technical gaps with a prioritized fix plan.

Get Free Audit
Ronnel Besagre

GEO/SEO Consultant · AI Automation Specialist

GEO pioneer helping APAC businesses maximize their visibility in AI search engines. Based in Johor Bahru, Malaysia.