How does Claude web search pick citations?
Claude web search selects citations by routing queries through Anthropic's retrieval system, which favors high-trust domains, peer-reviewed and government sources, and content that states clear factual claims. To be eligible, a site must allow both Anthropic-User and ClaudeBot in robots.txt. The system avoids forum content, SEO listicles, and pages with thin or AI-generated copy. Citation inclusion correlates strongly with Common Crawl reputation, Wikipedia presence, and a domain age of more than 24 months. Across factual queries, original research, official documentation, and peer-reviewed content dominate Claude's cited sources.
Evidence and detail
- Anthropic-User and ClaudeBot must both be allowed in robots.txt for live retrieval and citation inclusion.
- During retrieval, Claude downweights forums, listicles, and pages that Common Crawl reputation signals flag as ad-heavy or thin.
- Domain age over 24 months correlates with a 3.1x higher citation rate in technical, medical, and reference query categories.
- Original research, peer-reviewed papers, and official documentation dominate Claude citations across factual, scientific, and medical query categories.
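The robots.txt requirement above can be checked before deployment. The following is a minimal sketch using Python's standard `urllib.robotparser`. The user-agent tokens are copied from the text above and should be confirmed against Anthropic's current crawler documentation. The sample robots.txt, the `example.com` URL, and the helper name `allowed_agents` are illustrative assumptions, not part of any official tooling.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens as named in the text above (assumption: verify them
# against Anthropic's current crawler documentation before relying on them).
AI_AGENTS = ["Anthropic-User", "ClaudeBot"]

# Hypothetical robots.txt that allows both agents site-wide.
ROBOTS_TXT = """\
User-agent: Anthropic-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /private/
"""


def allowed_agents(robots_txt: str, url: str, agents=AI_AGENTS) -> dict[str, bool]:
    """Return, for each agent, whether robots_txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical page URL used only for illustration.
result = allowed_agents(ROBOTS_TXT, "https://example.com/docs/guide")
print(result)
```

If either value comes back `False`, the corresponding agent is blocked for that URL and the page would not meet the requirement described above.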