how should i configure robots.txt for ai bots in 2026?
Configure robots.txt to explicitly allow citation-driving bots, and decide separately on training bots. Always allow OAI-SearchBot, ChatGPT-User, PerplexityBot, Claude-User, Claude-SearchBot, and Bingbot for retrieval and citation. Block GPTBot, ClaudeBot, CCBot, and Google-Extended only if you do not want training inclusion, knowing that this also reduces future models' knowledge of your brand. Most commercial sites benefit from allowing both retrieval and training bots, since training inclusion drives brand recall in zero-click queries.
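One way to keep the two decisions separate is to generate the file from two lists, one for retrieval agents and one for training agents. The sketch below is illustrative only: `render_robots_txt` and both list names are hypothetical, the lists are a subset of the agents named above, and blocking the training list is shown purely as an example of the opt-out case.

```python
# Hypothetical helper: builds robots.txt text from two separate decisions.
# The agent lists are a subset of the bots named in the answer above.
RETRIEVAL_AGENTS = ["OAI-SearchBot", "ChatGPT-User", "PerplexityBot", "Bingbot"]
TRAINING_AGENTS = ["GPTBot", "CCBot", "Google-Extended"]


def render_robots_txt(allow, block):
    """Return robots.txt text with one explicit group per user agent."""
    groups = []
    for agent in allow:
        # Explicit allow group so a later catch-all rule cannot block it.
        groups.append(f"User-agent: {agent}\nAllow: /")
    for agent in block:
        # Site-wide opt-out for this agent.
        groups.append(f"User-agent: {agent}\nDisallow: /")
    # Fallback group for every crawler not named above.
    groups.append("User-agent: *\nAllow: /")
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(render_robots_txt(RETRIEVAL_AGENTS, TRAINING_AGENTS))
```

To allow training as well, pass an empty list as the second argument; the retrieval groups are unaffected either way.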
Evidence and detail
- OAI-SearchBot drives ChatGPT live citations and GPTBot drives training inclusion, so they are two separate access decisions.
- Blocking PerplexityBot removes you from Perplexity answers within roughly 7 days, based on observed deindexing.
- Google-Extended controls Gemini training but does not affect AI Mode citations from the live index.
- Roughly 22 percent of audited sites accidentally block citation bots through overly broad disallow rules.
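An accidental block of this kind can be checked for with Python's standard-library `urllib.robotparser`. The following is a minimal sketch, not a full audit: `blocked_agents`, the sample file, and the `example.com` URL are hypothetical, and Python's parser uses simple substring matching on agent names, so real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Subset of the citation-driving agents named above.
CITATION_AGENTS = ["OAI-SearchBot", "ChatGPT-User", "PerplexityBot", "Bingbot"]

# Hypothetical example of an overly broad rule: only Bingbot is allowed
# explicitly, so the catch-all group blocks every other crawler.
SAMPLE_ROBOTS_TXT = """\
User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="https://example.com/"):
    """Return the agents that may not fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    for agent in blocked_agents(SAMPLE_ROBOTS_TXT, CITATION_AGENTS):
        print(f"blocked: {agent}")
```

Running the same check against a site's real robots.txt text, with its own homepage URL, gives a quick list of citation bots caught by a catch-all rule.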
Related reading
Other buyer questions
- how do i get my site cited by chatgpt search in 2026?
- what are the ranking factors for perplexity in 2026?
- what sources does claude actually cite when it answers questions?
- how does google ai mode decide which sources to show?
- why does bing matter so much for ai search optimization?
- what is the difference between llms.txt and llms-full.txt?
- what schema markup do i need for ai search citations?
- what is indexnow and why does it matter for ai search?