What actually gets a brand cited by AI search — and why the dashboards can’t deliver it
by Ygor Fonseca, Founder & Systems Lead
Somewhere this quarter, a CEO asked why the company never shows up when ChatGPT answers questions about their category. Within a week, someone on the team was sitting in a demo for a tool that tracks brand mentions across AI platforms. The dashboard was impressive. The number was low. And the obvious next question — so what do we do to raise it? — is where the industry selling the dashboard gets noticeably quieter.
This post is about that question. What actually determines whether AI search — ChatGPT, Google’s AI Overviews, Perplexity — cites your brand when it writes an answer, and why most of what’s sold under the GEO and AEO labels (generative engine optimization, answer engine optimization — the acronym layer that grew up around AI visibility) measures the problem rather than moves it. We’ve leaned on this argument in passing before, including in our piece on AI agents as a second audience. It deserved its own post.
The fortnight that settled the first version of this argument
Through 2024 and 2025, the standard GEO prescription was technical: add schema markup, publish an llms.txt file, restructure your pages into AI-friendly chunks. That version of the playbook didn’t fade out gradually. It was taken apart in eleven days, by three unrelated sources, in May 2026.
Pedro Dias, a search practitioner who writes The Inference, made the architectural argument first: large language models read unstructured language by design — that’s the whole point of them — so prescriptions that promise better citation through markup are reasoning about an architecture that doesn’t exist. The GEO playbook, he argued, was largely SEO best practice repackaged as novelty.
Six days later, Ahrefs published a controlled study with a title that did the work: We Tracked 1,885 Pages Adding Schema. AI Citations Barely Moved. Nearly two thousand pages that added structured data, four thousand matched pages that didn’t, citations measured thirty days before and after across AI Overviews, AI Mode, and ChatGPT. No meaningful uplift on any platform. (Worth saying plainly: Ahrefs sells SEO tooling and has its own commercial angle. The methodology — matched controls, before-and-after measurement — is the part that survives the caveat, and it’s exactly the kind of test the prescriptions it falsified never ran.)
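For readers who want to see why the matched controls matter, here is a minimal sketch of the arithmetic behind a before-and-after comparison with a control group. It is not Ahrefs’s actual analysis: the function names and the citation counts are made up for illustration, and a real study would also need significance testing.

```python
from statistics import mean


def mean_change(before, after):
    """Average per-page change in citation count (after minus before)."""
    if len(before) != len(after):
        raise ValueError("before and after must cover the same pages")
    return mean(a - b for b, a in zip(before, after))


def net_uplift(treated_before, treated_after, control_before, control_after):
    """Change on pages that added markup, minus change on matched controls."""
    return mean_change(treated_before, treated_after) - mean_change(
        control_before, control_after
    )


# Made-up citation counts per page, for illustration only.
treated_before, treated_after = [4, 2, 6], [5, 3, 7]
control_before, control_after = [3, 5, 1], [4, 6, 2]

naive = mean_change(treated_before, treated_after)
net = net_uplift(treated_before, treated_after, control_before, control_after)
print(f"naive before/after change: {naive}, net of controls: {net}")
```

In this toy data the pages that added schema gained one citation each, which looks like a win until you notice the control pages gained one too. The net uplift is zero, and that is the comparison a before-and-after chart without controls can’t make.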
Four days after that, Google published developer guidance on AI search features and said the quiet part in documentation form: no special files, no llms.txt, no chunking, no AI-specific rewrites, and no extra schema needed to appear in generative AI search — Google still recommends structured data for classical rich results, but is explicit that it isn’t the AI-citation lever. Their own words: “From Google Search’s perspective, optimizing for generative AI search is optimizing for the search experience, and thus still SEO.” (Our own site ships an llms.txt file, for what it’s worth — it costs nothing to keep, and we would never bill it as the reason anyone gets cited.)
First principles, controlled measurement, platform vendor. Three independent channels, eleven days, one answer.
What the vendor layer sells now
Here’s the part we wanted to check before writing this, because the receipts above are a month old and this market moves fast. So this morning we pulled the current positioning of nine GEO/AEO vendors — hero copy, subheads, product descriptions, fetched today. The full table, verbatim and named, is in the appendix below, so you can check our read against the copy.
The schema checklists are mostly gone from the front pages. What’s there instead: citation dashboards, prompt tracking, share-of-voice scores, bot-crawl analytics, “marketing agents” that produce AI-ready content at volume. One vendor now sells a parallel, stripped-down version of your website served separately to AI crawlers. Seven of the nine sites lead entirely with measurement or machine-readability mechanics; the other two mix in content advice below the fold. None of the nine leads with the thing the evidence says drives citations.
To be fair about what these products are: tracking is honest work. Knowing how often AI assistants mention your brand, and in what sentiment, is a real measurement problem, and a dashboard that answers it is useful the way any analytics product is useful. The overclaim isn’t the thermometer. It’s the implied promise that the thermometer, plus some plumbing around your content, raises the temperature.
The strongest version of the pitch comes less from the tool layer than from the biggest marketing voices, and it deserves a fair statement: GEO as a second layer on top of SEO rather than a replacement — classical SEO earns the authority signals that make a model trust your domain, and the new layer shapes your content to fit the answers models write. The first half of that is solid; it’s the platform position restated, and it matches the retrieval mechanics below. The second half splits in two. Shaping content for answers, in the sense of clear, extractable, evidence-rich writing, is exactly what the surviving research supports — that part is real, and free. The paid version — the campaign playbook that promises citation outcomes from answer-formatting — has yet to publish a controlled test that clears the bar the schema era failed.
There’s a structural reason to be skeptical of that promise, and it comes from the people who build the models rather than the people who sell visibility into them. Anthropic’s own interpretability researchers describe the situation bluntly: “We mostly treat AI models as a black box: something goes in and a response comes out, and it’s not clear why the model gave that particular response instead of another.” The builders publish hedged language about what can be guaranteed; parts of the optimization industry publish confidence. Dias has a sharp line about this gradient — the closer you get to the model, the less certainty people claim — and it’s a useful filter for any vendor conversation: ask what they’d accept as evidence that their lever doesn’t work.
What the evidence says actually moves citations
Strip away the falsified prescriptions and the genuinely useful research left standing is fairly consistent.
Start with how the systems work, in Google’s own description: when AI search composes an answer, it runs searches — including a fan of related queries beyond the one the user typed — retrieves pages through the same index and ranking systems as classical search, reads them, and writes a response with links. Two things follow immediately. Your page has to be retrievable at all, which is the boring classical-SEO floor: indexed, crawlable, eligible. And once retrieved, your page is competing on what a reading system can extract from it.
That second competition is where the real research points. The academic paper the GEO acronym was originally borrowed from tested nine content-level techniques across ten thousand queries, and the ones that worked were almost embarrassingly editorial: adding real quotations, adding statistics, citing sources — each lifting visibility in generated answers by roughly a third against baseline. Keyword stuffing, the classic SEO-era carryover, performed below baseline. The paper never tested schema at all, a fact the vendor layer borrowing its acronym tended to skip. Its authors put the era in one line: techniques effective in search engines “may not translate to success in this new paradigm.”
A measurement preprint released this spring — not yet peer-reviewed, but built on 602 controlled prompts and twenty-one thousand citations — sharpened the picture by splitting being cited from being absorbed: whether the model merely links a page or actually draws language and facts from it. The pages that get absorbed are “richer in extractable evidence such as definitions, numerical facts, comparisons, and procedural steps.”
Call the common thread evidence density: how much checkable, extractable substance — numbers, named methods, first-hand observations, real quotes, working definitions — a page offers per screen of text. A system assembling an answer from retrieved pages keeps reaching for the page that hands it usable evidence, and keeps skipping the page that restates what ten other pages already say.
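Evidence density is a working label rather than a published metric, but a rough version is easy to approximate if you want to compare your own pages. The sketch below is a crude heuristic under stated assumptions: the three signal patterns, the 150-word “screen,” and the function name are all our own choices for illustration, and nothing here reflects how any AI search system actually scores a page.

```python
import re

# Hypothetical signal patterns; the definition above is prose, not a formula.
NUMBER = re.compile(r"\d[\d,.]*%?")
QUOTE = re.compile(r"[\"“][^\"”]{20,}[\"”]")
DEFINITION = re.compile(r"\b(?:is defined as|refers to|means)\b", re.IGNORECASE)


def evidence_density(text, words_per_screen=150):
    """Rough count of evidence signals per assumed 'screen' of text."""
    words = len(text.split())
    if words == 0:
        return 0.0
    signals = (
        len(NUMBER.findall(text))
        + len(QUOTE.findall(text))
        + len(DEFINITION.findall(text))
    )
    # Treat anything shorter than one screen as a full screen.
    screens = max(words / words_per_screen, 1.0)
    return signals / screens


generic = "Our platform helps brands grow faster with modern solutions that scale."
rich = (
    "We tracked 1,885 pages for 30 days. Citation rate refers to "
    "cited prompts divided by total prompts."
)
print(evidence_density(generic), evidence_density(rich))
```

A counter like this can’t tell a real statistic from a decorative one, so treat it as a prompt for editorial review rather than a score to optimize. The useful comparison is relative: which of your pages hand a reader something checkable, and which only restate the category.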
Which is also why this lever can’t be commoditized the way markup could. Lily Ray, a sixteen-year SEO veteran who now runs an AI-search consultancy, said it cleanly in an interview we keep coming back to: a business putting out “original research, original ideas, thought leadership, original data” owns something the engines can’t commoditize — “they’re not going to take that away from you.” Her advice is to invest in the things that are yours, the things that are proprietary. Anyone can add schema. Only you can publish your own numbers.
Where we’d put the budget
If a brand asked us to translate all of this into a spending decision, it would look like this.
Keep a thermometer if you want one — measurement has real value, and the tracking products do that job. Hold the plumbing to the platform’s own published bar, which is currently: none required. And put the recurring effort where every surviving piece of evidence points — into raising the evidence density of the pages you already publish. Your own data, even small. Named methods instead of vague claims. Numbers from your operations that no competitor can copy. Quotes from people who did the work. This is content your team actually has to produce rather than a setting anyone can toggle, which is exactly why it compounds and checklists don’t.
That answer is less convenient than a dashboard with an action plan. It’s also the one that three unrelated sources converged on in eleven days, and the one the vendor pivot quietly concedes: when the markup era got falsified, the industry moved to selling measurement, not to rebutting the controlled tests that falsified it.
The full nine-vendor sample is published below, verbatim and dated. If you’re seeing different positioning, or you’ve run a controlled test of your own that cuts either way, the comments are open. Evidence density applies to us too.
Appendix — the sample, verbatim (fetched 2026-06-05)
Hero copy and one product description per vendor, quoted as retrieved on the morning of June 5, 2026. We originally pulled ten; one vendor’s page wouldn’t render outside a browser, so rather than quote what we couldn’t read, we cut the row. One remaining page renders client-side and returned only its server-side title; that row is marked, and its quote is corroborated on the vendor’s fully rendered tool page. Positioning changes fast — treat this as a dated snapshot, and tell us if a row is stale. The classifications are ours; the copy is theirs.
| Vendor | Hero / headline (verbatim) | One product line (verbatim) | Leads with |
|---|---|---|---|
| Semrush | “Semrush AI Visibility \| Win Every Search. From Traditional SEO to AI Discovery” (title tag; page renders client-side) | “We submit millions of prompts to ChatGPT, SearchGPT, Gemini, and Google search, then track whether your brand is mentioned in the responses.” | Tracking, with content and backlink advice below the fold |
| AirOps | “The growth platform for AI Search, Google, Gemini, Perplexity, Claude, ChatGPT” | “Get citation tracking, competitor intelligence, and share of voice across every AI search platform so you always know your next move.” | Tracking, with content-velocity production behind it |
| Peec AI | “AI search analytics for marketing teams” | “We’ve set up tracking for the most important metrics within AI search.” | Measurement |
| Profound | “Marketing agents to win in Perplexity / ChatGPT / Claude…” | “Track how your site is interpreted and crawled by ChatGPT, Gemini, Claude, Perplexity, and more.” | Measurement + agents |
| Otterly | “Turn our AI Search Insights into your GEO services your clients will love” | “…automatically track brand mentions and website citations on Google AI Overviews, ChatGPT, Perplexity, Google AI Mode, Gemini, and Copilot.” | Measurement + technical tooling (“GEO Content Check,” “AI Crawler Simulation”) |
| Scrunch | “Monitor and improve your brand’s visibility in AI search” | “AXP creates a parallel, lightweight version of your site that’s translated for AI agents—preserving meaning while stripping noise and bloat.” | Machine-readability plumbing (the parallel-site product described in the body) |
| BrightEdge | “Discover the Power of AI Catalyst for SEO” | “Track brand visibility and sentiment across: Google AI Overviews, ChatGPT, Perplexity.” | Measurement |
| Writesonic | “AI is recommending your competitors. See exactly where and why.” (their GEO-named page now redirects here) | “Get a prioritized action plan. Content gaps, citation opportunities, technical issues. Ranked by impact.” | Measurement |
| Conductor | “Where Enterprises Go to Win AI Search” | “24/7 always-on monitoring tracks how AI bots crawl your site. Real-time alerts and prioritized fixes so visibility issues never become revenue problems.” | Measurement + bot-crawl monitoring |
Tally as we read it: seven of nine lead entirely with measurement or machine-readability mechanics, two mix in content work below the fold, and none leads with original evidence, quotes, statistics, or first-hand expertise as the citation lever. Run your own count — the links are above.