
Robots.txt for AI Crawlers: The Complete 2026 Configuration Guide

⚠️ Critical Finding: In Audit 1 of getoutloop.com, the robots.txt was missing entries for all major AI crawlers. Result: GEO Technical score of 0/15. One file change fixed this entirely.

What is robots.txt and Why Does It Matter for GEO?

robots.txt is a plain text file located at yourdomain.com/robots.txt that tells web crawlers which pages they can and cannot access. Every major search engine and AI platform respects this file before indexing your content.

For GEO purposes, robots.txt is the most critical single technical file on your website. If GPTBot, ClaudeBot, or PerplexityBot are blocked — either explicitly or by a catch-all restriction — those AI platforms cannot read your content. They will never cite you. Your GEO score will be zero regardless of how well you've optimized everything else.
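The failure mode is often a single catch-all rule. A hypothetical snippet (not from the audit) that makes a site invisible to every crawler, AI or otherwise:

```txt
User-agent: *
Disallow: /
```

Because GPTBot, ClaudeBot, and PerplexityBot all fall under the * group unless they are given a group of their own, these two lines are enough to zero out AI visibility for an entire site.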

Major AI Crawlers Reference Table

| User-agent | AI Platform | Priority |
|---|---|---|
| GPTBot | OpenAI (ChatGPT) | Tier 1 |
| OAI-SearchBot | OpenAI Search | Tier 1 |
| ClaudeBot | Anthropic (Claude) | Tier 1 |
| PerplexityBot | Perplexity AI | Tier 1 |
| Google-Extended | Google (Gemini / AI Overviews) | Tier 1 |
| Bingbot | Microsoft (Copilot) | Tier 1 |
| Applebot-Extended | Apple Intelligence | Tier 2 |
| FacebookBot | Meta AI | Tier 2 |
| Amazonbot | Amazon / Alexa | Tier 2 |
| cohere-ai | Cohere | Tier 2 |
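If you maintain the crawler list in code, the per-bot groups in the template below can be generated instead of hand-edited. A small sketch (the list mirrors the table above; the function name is illustrative):

```python
# Tier 1 and Tier 2 AI crawler user-agents from the reference table.
AI_CRAWLERS = [
    "GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot",
    "Google-Extended", "Bingbot", "Applebot-Extended",
    "FacebookBot", "Amazonbot", "cohere-ai",
]

def allow_groups(agents):
    """Render one 'User-agent / Allow: /' group per crawler, blank-line separated."""
    return "\n\n".join(f"User-agent: {a}\nAllow: /" for a in agents)

print(allow_groups(AI_CRAWLERS))
```

This keeps the robots.txt in sync with whatever list you audit against, so adding a new crawler is a one-line change.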

Copy-Paste robots.txt Template

Save this as robots.txt in your website root directory. One caveat: under the Robots Exclusion Protocol (RFC 9309), a crawler obeys only the most specific group matching its user-agent, so the Disallow rules in the * group do not apply to crawlers that have their own group. If GPTBot should also skip /private/ and /admin/, repeat those Disallow lines inside its group.

robots.txt
User-agent: *
Allow: /
Disallow: /private/
Disallow: /admin/
# Crawl-delay is non-standard: Bing honors it, Google ignores it
Crawl-delay: 1

# === AI SEARCH INDEXING (ALLOW ALL — GEO Visibility Strategy) ===

# ChatGPT / OpenAI
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

# Claude / Anthropic
User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

# Perplexity AI
User-agent: PerplexityBot
Allow: /

# Google AI (Gemini, AI Overviews)
User-agent: Google-Extended
Allow: /

User-agent: GoogleOther
Allow: /

# Microsoft Bing / Copilot
User-agent: Bingbot
Allow: /

# Apple Intelligence
User-agent: Applebot
Allow: /

User-agent: Applebot-Extended
Allow: /

# Meta AI
User-agent: FacebookBot
Allow: /

# Amazon Alexa / AI
User-agent: Amazonbot
Allow: /

# Common Crawl (AI Training)
User-agent: CCBot
Allow: /

# Cohere AI
User-agent: cohere-ai
Allow: /

# === STANDARD SEARCH ENGINES ===

User-agent: Googlebot
Allow: /

User-agent: Slurp
Allow: /

User-agent: DuckDuckBot
Allow: /

# === SITEMAP ===

Sitemap: https://yourdomain.com/sitemap.xml

How to Verify Your robots.txt Works

  1. Visit https://yourdomain.com/robots.txt in your browser; you should see the plain-text file.
  2. Check Google Search Console → Settings → robots.txt report to confirm the file was fetched without errors (the standalone robots.txt tester was retired in 2023).
  3. Test with: curl -A "GPTBot" https://yourdomain.com/ ; a 200 OK confirms your server or CDN is not blocking the user-agent string itself.
  4. Run a GEO audit using the /seo-geo-audit skill; the AI Crawler Access score should jump to 12+/15.
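You can also check how your rules resolve offline with Python's standard-library urllib.robotparser. A quick sketch (the file contents below are a trimmed stand-in for your real robots.txt):

```python
from urllib.robotparser import RobotFileParser

# Trimmed stand-in for a real robots.txt: a catch-all group with a
# Disallow, plus a dedicated group for GPTBot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# GPTBot matches its own group, whose only rule is Allow: /, so even
# /private/ is allowed for it (groups are not merged across user-agents).
print(parser.can_fetch("GPTBot", "https://example.com/private/notes"))         # True

# Any other crawler falls back to the * group and is blocked there.
print(parser.can_fetch("PerplexityBot", "https://example.com/private/notes"))  # False
print(parser.can_fetch("PerplexityBot", "https://example.com/blog/post"))      # True
```

Note the group-override behavior the output demonstrates: because GPTBot has its own group, the * group's Disallow no longer applies to it, which is why any sensitive paths must be repeated in every group that should honor them.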

Want Me to Audit Your robots.txt?

The free AI Visibility Audit includes a full robots.txt review + all GEO technical gaps with a prioritized fix plan.

Get Free Audit
Ronnel Besagre

GEO/SEO Consultant · AI Automation Specialist

GEO pioneer helping APAC businesses maximize their visibility in AI search engines. Based in Johor Bahru, Malaysia.