Original Research · Data Study

What AI engines actually cite: 37,547 citations analyzed

The short version

  • We analyzed 37,547 citations across 5,200 AI answers from ChatGPT, Claude, Gemini, Perplexity and Google AI Overviews.
  • Only about 5% of citations pointed to the brand's own website. Roughly 19 of every 20 went to a third party.
  • Citation behavior differs sharply by engine: ChatGPT cited a source in 38% of its answers, Google AI Overviews in 100%.
  • Review sites, directories and social platforms dominate, and what gets cited is heavily category specific. GEO is not just on-page SEO.

Everyone optimizing for AI search is asking the same question: what do the engines actually cite when they answer? Most advice is guesswork. So we looked at the data. Across a set of brands we track, we pulled every source that ChatGPT, Claude, Gemini, Perplexity and Google AI Overviews cited over a recent window and counted them. The result is one of the larger public looks at AI citation behavior we are aware of, and the headline finding is uncomfortable for anyone who has spent a decade polishing their own website.

5% of AI citations point to the brand's own site
95% point to third parties

Brand-owned domains were only about 5% of all 37,547 citations in this dataset. AI overwhelmingly repeats what other sources say about a brand, not what the brand says about itself.

How we measured this

The numbers below come from llemmy's own platform data, aggregated and anonymized. The sample:

  • 5,200 AI answers analyzed
  • 37,547 citations counted
  • 2,367 unique domains cited
  • 5 AI engines

The answers come from three brands in three different industries: a luxury travel brand, a local-services brand, and a membership and insurance brand. Each was queried across ChatGPT, Claude, Gemini, Perplexity and Google AI Overviews during June 2026. Brand names and their direct competitors are withheld. This is a real operating dataset, not a survey, so treat it as a strong directional signal rather than a census of the whole web. Where a pattern holds across all three very different verticals, we call it out, because that is where it is most likely to generalize.
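For readers who want to run a similar count on their own logs, here is a minimal sketch of the tallying step. It is not the llemmy pipeline. The sample URLs, the `BRAND_DOMAINS` set and the `summarize` helper are all hypothetical, and the sketch assumes each answer is available as a plain list of cited URLs.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical sample: each answer is a list of cited URLs (not study data).
answers = [
    ["https://www.yelp.com/biz/example", "https://www.facebook.com/examplebrand"],
    [],  # an answer that cited nothing
    ["https://www.examplebrand.com/pricing", "https://www.reddit.com/r/example/"],
]

# Hypothetical brand-owned domain(s) for the brand being tracked.
BRAND_DOMAINS = {"examplebrand.com"}


def domain_of(url: str) -> str:
    """Return the URL's host, lowercased and without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def summarize(answers):
    """Count citations, unique domains, brand-owned share and citation rate."""
    domains = Counter(domain_of(url) for cited in answers for url in cited)
    total = sum(domains.values())
    brand = sum(n for d, n in domains.items() if d in BRAND_DOMAINS)
    cited_answers = sum(1 for cited in answers if cited)
    return {
        "citations": total,
        "unique_domains": len(domains),
        "brand_share": brand / total if total else 0.0,
        "citation_rate": cited_answers / len(answers) if answers else 0.0,
    }


summary = summarize(answers)
print(summary)
```

Stripping only a leading `www.` is a simplification. A real pipeline would likely need to fold subdomains and regional variants into one registrable domain before counting.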

Finding 1: AI cites other people, not you

The 5% figure is the one to sit with. You can write the perfect page, and in the overwhelming majority of cases the engine will still build its answer from somewhere else. When we grouped all 37,547 citations by the type of source, the picture got clearer:

Share of all citations, by source type

  • Long tail (category sites): 56%
  • Review & directory: 15%
  • Social platforms: 14%
  • Brand-owned: 5%
  • Search engines: 4%
  • Forums & UGC: 4%
  • News & editorial: 2%

Review sites, directories and social together outweigh every brand-owned site combined by roughly six to one.

Review and directory sites (the Yelps, comparison sites and "best of" listings of the world) were the single largest recognizable category at about 15%, with social platforms close behind at 14%. The largest slice of all, 56%, was a long tail of category-specific sites: insurance comparison tools for one brand, local service directories for another, travel-specific publishers for the third. There was almost no overlap between those long tails across verticals, which is itself a finding.
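The six-to-one comparison can be checked directly from the rounded shares above. The short sketch below does that arithmetic and converts each share into an approximate citation count. Because the published percentages are rounded, the counts are estimates, not figures taken from the dataset.

```python
TOTAL_CITATIONS = 37_547

# Rounded shares as published in this study (percent of all citations).
share_pct = {
    "Long tail (category sites)": 56,
    "Review & directory": 15,
    "Social platforms": 14,
    "Brand-owned": 5,
    "Search engines": 4,
    "Forums & UGC": 4,
    "News & editorial": 2,
}

# Sanity check: the published shares account for every citation.
assert sum(share_pct.values()) == 100

third_party_pct = 100 - share_pct["Brand-owned"]
review_social_pct = share_pct["Review & directory"] + share_pct["Social platforms"]
ratio = review_social_pct / share_pct["Brand-owned"]

# Approximate counts only: derived from rounded percentages.
approx_counts = {
    name: round(TOTAL_CITATIONS * pct / 100) for name, pct in share_pct.items()
}

print(f"Third-party share: {third_party_pct}%")
print(f"Review + social vs brand-owned: {ratio:.1f} to 1")
print(f"Approx. brand-owned citations: {approx_counts['Brand-owned']}")
```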

Finding 2: a handful of platforms show up everywhere

While the long tail is category specific, a small set of horizontal platforms appeared across all three brands and most engines. These are the sources worth having a presence on no matter what you sell:

Most-cited individual domains (share of all citations)

  • Facebook: 6.6%
  • Instagram: 5.8%
  • Google: 3.8%
  • Reddit: 3.5%
  • Yelp: 2.8%

Social profiles and review platforms are the most-cited individual domains. Reddit alone outcites every news outlet in the dataset.

Two things stand out. First, your social profiles are GEO surface area: an active, accurate Facebook or Instagram presence is cited more than most of what you publish on your own domain. Second, Reddit is a heavyweight, the single most-cited forum, ahead of every traditional publisher we saw. If a thread is shaping the conversation in your category, the engines are reading it.

Finding 3: every engine cites differently

If you treat "AI search" as one channel, you will get this wrong. The share of answers that included any citation ranged from barely over a third to every single answer:

Share of answers that include at least one citation

  • Google AI Overviews: 100%
  • Perplexity: 90%
  • Claude: 66%
  • Gemini: 44%
  • ChatGPT: 38%

Citation rate by engine. The most widely used assistant, ChatGPT, was the least likely to cite a source.

Citation depth varied just as much. When an engine did cite, Claude pulled in the most sources by far, over 20 per answer on average, while Perplexity and Gemini sat around 10, and ChatGPT and AI Overviews stayed near 5. The takeaways:

Across everything, about 68% of all answers carried at least one citation, but as the chart shows, that average hides a huge spread.
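To see how a blended figure like this depends on the mix of engines, here is a small sketch that weights each engine's citation rate by its number of answers. The per-engine answer counts are an assumption: this study reports 5,200 answers in total, not the split, so the example uses an even 1,040 per engine for illustration. The `blended_rate` helper is hypothetical.

```python
# Published per-engine citation rates (percent of answers with a citation).
rate_pct = {
    "Google AI Overviews": 100,
    "Perplexity": 90,
    "Claude": 66,
    "Gemini": 44,
    "ChatGPT": 38,
}


def blended_rate(rate_pct, answers_per_engine):
    """Answer-weighted share (in percent) of answers with at least one citation."""
    total = sum(answers_per_engine.values())
    cited = sum(rate_pct[e] / 100 * answers_per_engine[e] for e in rate_pct)
    return 100 * cited / total


# Assumption: the 5,200 answers are split evenly across the five engines.
even_split = {engine: 1040 for engine in rate_pct}
overall = blended_rate(rate_pct, even_split)

print(f"Blended citation rate: {overall:.1f}%")
```

Under the even-split assumption the blend lands close to the reported 68%. A different mix, for example one weighted toward ChatGPT, would pull the blended number down, which is why the per-engine rates are more useful than the average.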

What this means for your GEO strategy

Put the three findings together and the playbook writes itself:

The bigger picture

For twenty years, search optimization meant making your own pages better and hoping to rank. AI search breaks that model. The engine assembles an answer from a web of third-party sources and hands the user a verdict, and in our data your own site is a rounding error in that mix. Generative Engine Optimization is less about your homepage and more about your reputation across the sources AI trusts. The brands that win the AI answer will be the ones that show up, accurately and often, everywhere the engines look.

FAQ

What sources do AI engines cite most?

In our analysis of 37,547 citations, the most-cited individual domains were social platforms (Facebook and Instagram), Google, Reddit and review or directory sites like Yelp. By type, review and directory sites were about 15% of citations and social about 14%, while brand-owned sites were only about 5%. The largest share, roughly 56%, came from a long tail of category-specific sites.

How often does AI cite a brand's own website?

Rarely. Brand-owned domains were only about 5% of all citations in our dataset, meaning roughly 19 of every 20 AI citations pointed to a third party. AI engines mostly repeat what other sources say about a brand rather than what the brand says about itself.

Do different AI engines cite at different rates?

Yes, dramatically. Google AI Overviews carried a citation in essentially every answer we measured (about 100% over the June 2026 window) and Perplexity in about 90%, while Claude cited in about 66%, Gemini in about 44%, and ChatGPT in only about 38%. Citation depth ranged from around 5 sources per cited answer on ChatGPT and AI Overviews to over 20 on Claude.

What does this mean for GEO strategy?

Because AI mostly cites third parties, GEO is not just on-page SEO. You have to earn presence on the review sites, directories, social platforms and community forums engines actually cite, and tailor the approach per engine. Optimizing only your own website addresses about 5% of where citations come from.

Original research by the llemmy team, June 2026. Data aggregated and anonymized from the llemmy platform. Related reading: What makes a page AI-readable, The AI Citation Gap, and How to track your brand across AI engines.
