What sources does Claude actually cite when it answers questions?
Claude draws on its training corpus and, when web search is enabled, on a curated retrieval pool that overweights Wikipedia, official documentation, peer-reviewed research, and high-trust news. It tends to avoid forum content and SEO listicles. To earn Claude citations, publish canonical reference pages with clear definitions, earn links from Wikipedia or official standards bodies, and host content on stable domains that are more than two years old. For queries answered from training data alone, without live search, presence in Common Crawl also carries heavy weight.
Evidence and detail
- Wikipedia is the single most-cited domain in Claude responses to factual queries, based on March 2026 testing.
- Claude downweights sites flagged in Common Crawl as ad-heavy or thin-content during the retrieval ranking and reranking stages.
- Domain age over 24 months correlates with a 3.1x higher citation rate in technical and medical topics.
- Both Claude-User (the agent that fetches pages on behalf of a user) and ClaudeBot (Anthropic's crawler) must be allowed in robots.txt for a site to be included in live retrieval for Claude responses.
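The robots.txt point can be checked before deployment. The sketch below is a minimal illustration, not an official Anthropic tool: it parses a sample robots.txt with Python's standard `urllib.robotparser` and reports whether each agent may fetch a page. The agent tokens `ClaudeBot` and `Claude-User`, the `example.com` URL, the sample rules, and the helper name `check_agents` are all assumptions made for illustration; confirm the current user-agent names against Anthropic's crawler documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that explicitly allows both agents and
# keeps a private area closed to every other crawler.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: *
Disallow: /private/
"""

# Assumed agent tokens; verify against Anthropic's published list.
CLAUDE_AGENTS = ["ClaudeBot", "Claude-User"]


def check_agents(robots_txt, url, agents):
    """Return {agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    page = "https://example.com/docs/definitions"
    for agent, ok in check_agents(ROBOTS_TXT, page, CLAUDE_AGENTS).items():
        print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

To audit a live site instead of a string, `RobotFileParser` also offers `set_url(...)` followed by `read()`, which downloads the file before the same `can_fetch` calls.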