The Buffy Blog

How to turn query fan-out data into AI citations

A measurement-driven content method: mine the query fan-outs behind the prompts you track for recurring themes, publish focused content against them, then measure crawl-to-citation lag and double down on what gets cited. Based on a practitioner loop that grew a new domain's daily ChatGPT citations from ~200 to 1,700+.

5 min read · Updated June 17, 2026

The most reliable way to produce content AI engines cite is to let their own query fan-out tell you what to write, then measure which pages actually get cited and double down. The loop has three moves: mine the fan-outs behind the prompts you track for recurring themes, publish focused content against them, and measure crawl-to-citation lag so you keep winners and cut the rest. One practitioner reported using this loop to grow a brand-new domain's daily ChatGPT citations from about 200 to over 1,700.

A scope note up front: that 200→1,700+ figure is single-vendor and self-reported (Promptwatch's co-founder, June 2026) — directional evidence, not a guarantee. The method is what travels, and it lines up with how AI retrieval and crawling demonstrably work.

What is the query fan-out content loop?

It's a content process driven by measurement instead of intuition. Rather than brainstorming topics, you observe the sub-questions AI engines generate when answering the prompts your customers actually ask, then write to those and let citation data decide what to scale. Put as a simple chain: the engine's fan-out tells you the demand, your content answers it, and your crawl-and-citation tracking grades the result.

It assumes you already track a realistic prompt set. If you haven't built one, start with how to choose which prompts to track — the loop is only as good as the prompts feeding it. The aim is deliberately modest and scalable: don't "over-engineer it," just keep the loop turning.

Step 1 — Mine the query fan-outs for recurring themes

When an AI engine answers a prompt, it quietly fans that one prompt into many sub-queries and gathers passages for each (how query fan-out works). Those sub-queries are a direct readout of what the engine wants content for.

After a few days of tracking prompts in an organic prompt monitor, analyse the fan-outs across all of those prompts and look for patterns:

  • Repeated wording — the exact phrases that recur across fan-outs are the language to write in.
  • Average number of fan-outs — how many sub-questions a prompt tends to spawn, which sizes the topic.
  • Recurring sub-questions — the branches that show up again and again are your content backlog, ranked by demand.

The output of step one is a list of themes grounded in real engine behaviour, not a guess. It is the same fan-out lens used to diagnose where you're absent, turned outward into a content plan.
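The three patterns above can be approximated with simple counting. What follows is a minimal sketch, not a description of any tool's method: it assumes you can export fan-outs as a mapping from tracked prompt to sub-queries, and the sample data, the `mine_themes` function, and its `n` and `min_prompts` parameters are all hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical export: tracked prompt -> sub-queries the engine fanned out to.
fanouts = {
    "best crm for startups": [
        "crm pricing for small teams",
        "crm setup time for startups",
        "top rated crm tools",
    ],
    "affordable crm software": [
        "crm pricing for small teams",
        "free crm tools comparison",
    ],
    "crm for small sales team": [
        "crm pricing for small teams",
        "crm setup time for startups",
        "free crm tools comparison",
        "crm mobile app features",
    ],
}


def ngrams(text, n):
    """Return the n-word phrases in a sub-query, lowercased."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def mine_themes(fanouts, n=2, min_prompts=2):
    """Summarise fan-outs: average size, recurring phrases, recurring sub-questions."""
    avg_fanouts = mean(len(subs) for subs in fanouts.values())
    phrase_counts = Counter()
    subquery_counts = Counter()
    for subs in fanouts.values():
        # Count each item once per prompt so a single chatty prompt cannot dominate.
        phrase_counts.update({g for s in subs for g in ngrams(s, n)})
        subquery_counts.update({s.lower().strip() for s in subs})
    phrases = [(p, c) for p, c in phrase_counts.most_common() if c >= min_prompts]
    backlog = [(q, c) for q, c in subquery_counts.most_common() if c >= min_prompts]
    return avg_fanouts, phrases, backlog


avg_fanouts, phrases, backlog = mine_themes(fanouts)
print(f"Average fan-outs per prompt: {avg_fanouts}")
for question, prompt_count in backlog:
    print(f"{prompt_count} prompts -> {question}")
```

Counting by number of prompts, rather than raw occurrences, is one design choice among several; it ranks the backlog by how widely a sub-question recurs across the prompt set.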

Step 2 — Publish focused content against each theme

Write one clean, self-contained piece per recurring theme, in the wording the fan-outs revealed. The craft that makes a passage liftable is covered in how to get cited by AI: answer-first chunks, facts in tables and lists, question-style headings, and specific, dated claims. The point of the loop is that you're not choosing topics in the dark — each piece targets demand the engine has already shown.

Favour focused, single-topic depth over sprawling guides. Practitioner field data has repeatedly found that deep, single-topic explainers overtake broad comparison pages on citations over time — so the fan-out themes are best answered as a set of tightly-scoped pages, each owning one branch, interlinked into a hub.

Step 3 — Measure crawl-to-citation lag, then double down

This is the step most content programmes skip, and it's where the loop earns its keep. After publishing, measure two things per page: how long it takes to get crawled by AI bots, and how long until it starts getting cited.

Expect a gap between them. Crawling comes first, indexing second, citations third — so a page can be crawled heavily for weeks while earning zero citations, which is a leading indicator, not a failure. Watching your server logs for AI crawler hits tells you a page has entered the pipeline before any citation shows up.
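As a rough illustration of watching server logs, the sketch below scans access-log lines for AI crawler user agents and records when each page was first hit. It assumes the common "combined" log format, and the bot tokens listed are ones vendors have published; check them against current vendor documentation before relying on them. The sample lines, the `ai_crawler_hits` function, and the example paths are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: user-agent tokens published by the vendors; verify before use.
AI_BOT_TOKENS = ("GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot")

# Assumption: access logs use the combined log format.
LOG_LINE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def ai_crawler_hits(lines):
    """Return {path: {"first_seen": datetime, "hits": Counter of bot -> count}}."""
    pages = {}
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # not a GET/HEAD request line in the expected format
        agent = match["ua"].lower()
        bot = next((b for b in AI_BOT_TOKENS if b.lower() in agent), None)
        if bot is None:
            continue  # ordinary browser or an unlisted bot
        seen = datetime.strptime(match["ts"], "%d/%b/%Y:%H:%M:%S %z")
        page = pages.setdefault(match["path"], {"first_seen": seen, "hits": Counter()})
        page["first_seen"] = min(page["first_seen"], seen)
        page["hits"][bot] += 1
    return pages


# Made-up sample lines for illustration only.
sample_log = [
    '203.0.113.7 - - [17/Jun/2026:10:15:32 +0000] "GET /blog/crm-pricing HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '198.51.100.4 - - [17/Jun/2026:11:02:10 +0000] "GET /blog/crm-pricing HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '192.0.2.9 - - [18/Jun/2026:08:40:01 +0000] "GET /blog/crm-pricing HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"',
]

crawl_report = ai_crawler_hits(sample_log)
for path, info in crawl_report.items():
    print(path, info["first_seen"].date(), dict(info["hits"]))
```

Matching on the user-agent string alone trusts whatever the client claims; a stricter setup would also verify the requesting IP ranges, which is outside this sketch.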

Signal | What it means | What to do
Crawled, no citations yet (weeks 1–3) | Normal lag: page is in the pipeline | Wait; don't judge it dead
Crawled, then cited | The theme and format are working | Double down: write more in that vein
Not crawled at all | Access or discovery problem | Check crawler access and internal links before judging content
Crawled heavily, still uncited after weeks | The piece isn't earning selection | Cut your losses and move on to the next theme
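The table can be read as a small decision rule. The sketch below is one possible encoding, under stated assumptions: the 21-day window is taken from the "weeks 1–3" row rather than from any measured threshold, and `grade_page` and its parameters are hypothetical names.

```python
from datetime import date


def grade_page(published, first_crawled, first_cited, today, lag_window_days=21):
    """Map a page's crawl and citation state to the action in the table.

    first_crawled and first_cited are dates, or None if it has not happened yet.
    lag_window_days is an assumption based on the "weeks 1-3" row.
    """
    if first_cited is not None:
        return "double down"
    if first_crawled is None:
        return "check crawler access and internal links"
    if (today - published).days <= lag_window_days:
        return "wait"
    # Crawled but still uncited once the normal lag window has passed.
    return "cut losses"


today = date(2026, 6, 30)
print(grade_page(date(2026, 6, 20), date(2026, 6, 22), None, today))
print(grade_page(date(2026, 5, 1), date(2026, 5, 3), date(2026, 5, 25), today))
print(grade_page(date(2026, 5, 1), None, None, today))
print(grade_page(date(2026, 5, 1), date(2026, 5, 3), None, today))
```

The function only looks at whether a crawl happened, not how heavy it was; weighting by crawl volume would need hit counts from the server logs.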

Once content is published, the only honest question is what got cited. Double down on the pages that win citations, stop investing in the ones that don't — and keep the loop turning. That continuous cycle is how daily citations compound.

The one caveat: only "kill" a page once you've ruled out access and parseability. A well-written piece that never gets crawled is a reachability problem, not a content one, and killing it would be learning the wrong lesson.

How long before the loop pays off?

Weeks, not days — and that's by design, because of the crawl→index→cite lag. The discipline is to let each piece clear that lag before grading it, then let citation data, not opinion, decide what you scale. Because live retrieval also favours recently-updated pages, winners need a refresh cadence to keep their citations from decaying.
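To decide how long "clearing the lag" takes on your own site, it can help to compute the lag from your own pages rather than assume a number. This is a minimal sketch with invented dates; the `page_log` structure and the `lag_days` helper are hypothetical, and the three dates per page would come from your CMS, server logs, and citation tracking respectively.

```python
from datetime import date
from statistics import median

# Invented example data: publish date, first AI crawl, first observed citation.
page_log = {
    "/blog/crm-pricing": {
        "published": date(2026, 6, 1),
        "first_crawled": date(2026, 6, 3),
        "first_cited": date(2026, 6, 20),
    },
    "/blog/crm-setup-time": {
        "published": date(2026, 6, 1),
        "first_crawled": date(2026, 6, 5),
        "first_cited": date(2026, 6, 30),
    },
    "/blog/free-crm-tools": {
        "published": date(2026, 6, 8),
        "first_crawled": date(2026, 6, 9),
        "first_cited": None,  # crawled, not cited yet
    },
}


def lag_days(start, end):
    """Days between two events, or None if either has not happened yet."""
    if start is None or end is None:
        return None
    return (end - start).days


publish_to_crawl = {
    path: lag_days(p["published"], p["first_crawled"]) for path, p in page_log.items()
}
crawl_to_cite = {
    path: lag_days(p["first_crawled"], p["first_cited"]) for path, p in page_log.items()
}

# Only pages that have actually been cited contribute to the typical lag.
observed = [days for days in crawl_to_cite.values() if days is not None]
typical_lag = median(observed) if observed else None

print("Publish-to-crawl days:", publish_to_crawl)
print("Crawl-to-citation days:", crawl_to_cite)
print("Median crawl-to-citation lag:", typical_lag)
```

A median over cited pages understates the true lag early on, because slow pages have not been cited yet; treat it as a floor until more pages have cleared.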

Running this loop by hand — tracking prompts, analysing fan-outs, watching crawl logs, and attributing citations to specific pages over weeks — is a lot of measurement. Doing that measurement continuously, across every engine, and turning it into a prioritised content signal is exactly what Buffy Intel is built to provide.

Frequently asked

What is a query fan-out content loop?

It's a repeatable method for producing AI-citable content from tracking data: monitor a realistic set of prompts, analyse the query fan-outs they trigger to find recurring sub-questions and wording, write focused content answering those themes, then measure how long each page takes to get crawled and cited, keeping what wins and cutting what doesn't. It treats content as an experiment guided by what AI engines actually ask.

How long does it take for new content to get cited by AI?

Expect a lag of weeks, not days. A page is usually crawled by AI bots first, then enters the engine's search index, then starts appearing in answers, so heavy crawling with zero citations early on is a leading indicator, not a failure. The practical move is to measure crawl-to-citation time per page and judge results over weeks, not in week two.

Should I keep or kill content that isn't getting cited?

Give it enough time to clear the normal crawl-to-cite lag of several weeks first. After that, the practitioner approach is to double down on the topics and formats that earn citations and stop investing in the ones that don't, provided the page was genuinely reachable and well-structured, since an access or parseability problem can suppress an otherwise good piece.