AI citations are won by verifiability, not scarcity

Few companies publishing about generative search optimization report testing their own advice against an unbiased reviewer. We did, and the audit overturned one of our own assumptions about how content earns citations in AI answers. This is what we found, and what we changed because of it.

The short version

We had been building toward a single idea: make content that a competitor cannot reproduce, and AI engines will cite it because it is the only source for that information. It sounds right. It is half wrong.

When we put our method in front of independent reviewers with no knowledge of who built it, the verdict was consistent: engineering content to be non-reproducible optimizes for a business moat, not for a citation. Answer engines do not cite the rarest source. They cite the source that is easiest to verify and is corroborated elsewhere. Those are different goals, and we had been conflating them.

What the evidence actually says

Two findings did the most to change our mind, and both are worth checking yourself rather than taking on our word.

The first is a peer-reviewed study (Aggarwal et al., presented at KDD 2024) that tested content interventions across thousands of queries on a generative-engine benchmark. Among the interventions that moved AI visibility the most were adding statistics and citing sources, each with a relative lift on the order of 30–40% on the study's visibility metric. The lift came from the number being present and attributed, not from it being secret.

The second is a pattern reported across multiple 2025–2026 corpus analyses of AI Overview and Perplexity citations: a large majority of citations point to third-party sources rather than a brand's own domain, and a figure corroborated in several independent places is cited more often than one that appears in only one. We treat the exact percentages from vendor studies as directional, not laboratory-grade, but the direction is consistent enough to act on.

Put together: a figure that exists in only one place, deliberately unverifiable from the outside, is not a citation magnet. It is the weakest position to be cited from.

What this means in practice

Here is the distinction we now hold, and it is the whole lesson:

For your business, non-reproducibility is the moat. A metric computed from your own data, that a competitor cannot recompute without your data, is a genuine advantage. Keep the underlying data private. That is real and worth protecting.

For the citation, verifiability is the lever. Publish the method behind that metric (the segment it covers, the sample size, the date it was measured, the public benchmark it is compared against) so a reader, or an answer engine, can see exactly how it was reached and trust it enough to quote. The raw data stays yours; the method is shown.

The mistake is hiding the method to protect the moat. You can do both: private data, public method.
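
One way to picture "private data, public method" is a small record published next to each figure. The sketch below is illustrative only, not a description of our production system: the MethodNote class, its field names, and every value in the example are placeholders chosen to mirror the four items named above (segment, sample size, date, benchmark). The raw rows never appear in it.

```python
from dataclasses import dataclass, asdict
from datetime import date
import json


@dataclass(frozen=True)
class MethodNote:
    """Public description of how a figure was reached (hypothetical schema)."""
    metric: str        # what was measured
    value: float       # the published figure
    unit: str          # e.g. "percent"
    segment: str       # which slice of the data the figure covers
    sample_size: int   # how many records it is based on
    measured_on: date  # when it was measured
    benchmark: str     # the public benchmark it is compared against

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are empty or unusable."""
        missing = [name for name, val in asdict(self).items()
                   if val is None or val == ""]
        if self.sample_size <= 0 and "sample_size" not in missing:
            missing.append("sample_size")
        return missing

    def to_public_json(self) -> str:
        """Serialize the method note only; the underlying rows stay private."""
        payload = asdict(self)
        payload["measured_on"] = self.measured_on.isoformat()
        return json.dumps(payload, indent=2)


# Placeholder values for illustration; none of these are real measurements.
note = MethodNote(
    metric="example_metric",
    value=12.5,
    unit="percent",
    segment="example segment",
    sample_size=400,
    measured_on=date(2026, 1, 15),
    benchmark="example public benchmark",
)

if not note.missing_fields():
    print(note.to_public_json())
```

A note with an empty segment or a zero sample size would be flagged by missing_fields() rather than published, which is the point: the figure does not go out without the context a reader needs to check it.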

The honest part

We are saying this as a company that got it wrong on its own site first. Our earlier writing leaned on "no one else can reproduce this" as if that alone earned citations. It does not. We changed our public method pages to lead with verifiability, and we are rebuilding the part of our system that publishes the method alongside every original figure.

We measure our own output the way we would measure anyone's: does it say something new, does every number check out against its source, and does it read as though a person wrote it. When a piece falls short of that, we hold it rather than publish it. This one cleared the bar because every claim above can be checked: the study is named, the citation pattern is labeled directional, and the reasoning is something you can disagree with.
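
The "every number checks out against its source" rule can be approximated with a simple pre-publication check. This is a minimal sketch under stated assumptions, not our actual review tooling: the function names are hypothetical, the pattern only catches percentages (including ranges), and a human still has to confirm that the named source really supports the figure.

```python
import re

# Matches percentages such as "40%" and ranges such as "30-40%" or "30–40%".
FIGURE_PATTERN = re.compile(r"\d+(?:\.\d+)?(?:\s*[-–]\s*\d+(?:\.\d+)?)?\s*%")


def unsourced_figures(text: str, sources: dict[str, str]) -> list[str]:
    """Return each percentage in the text that has no named source."""
    found = [m.group(0).replace(" ", "") for m in FIGURE_PATTERN.finditer(text)]
    return [fig for fig in found if not sources.get(fig, "").strip()]


def ready_to_publish(text: str, sources: dict[str, str]) -> bool:
    """Hold the piece if any figure lacks a named source."""
    return not unsourced_figures(text, sources)


# The sentence and its source both come from this article.
draft = "Adding statistics lifted visibility on the order of 30–40%."
sources = {"30–40%": "Aggarwal et al., KDD 2024"}

print(ready_to_publish(draft, sources))  # the figure has a named source
print(unsourced_figures(draft, {}))      # the same figure with no source on file
```

The check is deliberately strict: it only lists what is missing, and a piece with any unmatched figure is held.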

What to take away

If you publish original data to win AI citations, do not stop at "we have data no one else has." Show the work. Name the method, the sample, the date, and the benchmark. Keep the rows private and make the reasoning public. The source that is easiest to verify — not the rarest — is the one that gets cited.

Last updated: June 2026. Sources: Aggarwal et al., "GEO: Generative Engine Optimization," KDD 2024; 2025–2026 corpus analyses of AI Overview and Perplexity citation patterns (directional). Method: this article was reviewed against an internal standard requiring every figure to be matched to a named source before publication.