How AI Overviews and AI Mode pick sources

Google's AI features do not run a secret beauty contest. Here is how AI Overviews and AI Mode actually choose which pages to quote and cite, in plain English.


The question everyone gets wrong

When a Google AI Overview writes a paragraph and links to four sites, the obvious question is: how did those four sites get picked, and how do I become one of them?

Most answers you will read assume there is a separate machine making the choice. A special AI index. A hidden ranking signal. A markup file you forgot to add. There isn't. The sources come from the same search index that has always decided what shows up for a query. Once you understand the two pieces of plumbing that sit on top of that index, the whole thing becomes far less mysterious, and it becomes clear what to do about it.

The two pieces are called retrieval-augmented generation and query fan-out. Both sound like engineering jargon. Both are simple once you see what they are doing.

Retrieval-augmented generation, without the jargon

A language model on its own is a confident guesser. Ask it a question and it produces an answer from what it absorbed during training, which means it can be out of date, or just wrong, with total fluency.

Retrieval-augmented generation (RAG) fixes that by giving the model homework before it answers. Instead of writing from memory, the system first retrieves real documents, then asks the model to write an answer grounded in those documents. The "retrieval" is a lookup. The "generation" is the writing. The "augmentation" is the documents in between.

For Google's AI features, the retrieval step is a search of the core index. The same index that powers blue links. Google is explicit about this: AI Overviews and AI Mode are built on top of Search, drawing from the regular index rather than some parallel AI corpus, as laid out in its AI features optimization guidance.

So the chain is: your page gets crawled, it gets indexed, and when a relevant query fires, it becomes a candidate document the model can retrieve and quote. If you are not in the index, you cannot be retrieved. If you cannot be retrieved, you cannot be cited. Everything downstream depends on that first link in the chain.
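To make the chain concrete, here is a minimal sketch of retrieve-then-write in Python. Everything in it is an assumption for illustration: the tiny `INDEX` dictionary stands in for a search index, the word-overlap scoring is a toy, and `generate` is a stub where a real system would call a language model. It is not how Google implements any of this. What it does show is the dependency: a page that is not in `INDEX` can never reach the citations list.

```python
import re

# Hypothetical stand-in for a search index: URL -> page text.
INDEX = {
    "https://example.com/standing-desks":
        "Standing desks may ease back pain when you alternate sitting and standing.",
    "https://example.com/desk-chairs":
        "An ergonomic chair supports the lower back during long sitting sessions.",
    "https://example.com/coffee":
        "Coffee brewing guide for beginners.",
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, index, k=2):
    """Retrieval: rank indexed pages by word overlap with the query (toy scoring)."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(text)), url) for url, text in index.items()]
    scored = [(score, url) for score, url in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored[:k]]


def generate(query, sources, index):
    """Generation: a real system would prompt a language model with the retrieved
    text. This stub only joins the retrieved passages and lists the citations."""
    grounded_text = " ".join(index[url] for url in sources)
    return {"answer": grounded_text, "citations": sources}


def answer(query, index):
    """Retrieve first, then write from what was retrieved."""
    sources = retrieve(query, index)
    return generate(query, sources, index)


if __name__ == "__main__":
    print(answer("standing desk back pain", INDEX)["citations"])
```

Only pages already stored in `INDEX` are ever scored, so the only way to appear in `citations` is to be indexed first.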

Query fan-out: one question becomes many

Here is the part that explains why AI answers feel broader than a single search result.

When you type a question into AI Mode, Google does not run that one query and stop. It uses a technique called query fan-out: it generates a cluster of related sub-queries and fires them concurrently, then gathers candidate sources across all of them.

Say you ask "is a standing desk worth it for back pain". Behind the scenes the fan-out might also search for the health evidence on standing while working, the downsides of standing too long, how sit-stand desks compare, and what physiotherapists recommend. Each of those mini-searches pulls its own set of pages from the index. The AI answer is stitched together from the strongest sources across the whole fan.
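The standing desk example can be sketched in a few lines of Python. The sub-queries, the `FAKE_RESULTS` lookup, and the merge rule (count how many branches returned each page) are all hypothetical choices made for illustration. A real system would generate the sub-queries with a model and rank results in ways this sketch does not attempt to reproduce.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fan-out table: broad question -> related sub-queries.
SUB_QUERIES = {
    "is a standing desk worth it for back pain": [
        "health evidence on standing while working",
        "downsides of standing too long",
        "what physiotherapists recommend for desk work",
    ],
}

# Hypothetical search results: query -> candidate pages from the index.
FAKE_RESULTS = {
    "is a standing desk worth it for back pain": ["a.example/review", "b.example/evidence"],
    "health evidence on standing while working": ["b.example/evidence", "c.example/study"],
    "downsides of standing too long": ["d.example/risks"],
    "what physiotherapists recommend for desk work": ["e.example/physio", "b.example/evidence"],
}


def fan_out(query):
    """Return the original query plus its related sub-queries."""
    return [query] + SUB_QUERIES.get(query, [])


def search(sub_query):
    """Stand-in for one index lookup."""
    return FAKE_RESULTS.get(sub_query, [])


def gather_candidates(query):
    """Run every sub-query concurrently, then merge the candidate pages."""
    sub_queries = fan_out(query)
    with ThreadPoolExecutor(max_workers=4) as pool:
        result_lists = list(pool.map(search, sub_queries))

    # Assumed merge rule: pages returned by more branches sort first.
    hits = {}
    for urls in result_lists:
        for url in urls:
            hits[url] = hits.get(url, 0) + 1
    return sorted(hits, key=lambda url: (-hits[url], url))


if __name__ == "__main__":
    print(gather_candidates("is a standing desk worth it for back pain"))
```

In this toy data, `b.example/evidence` surfaces in three of the four branches, which illustrates why a page that answers one facet clearly can be picked up even when it never targeted the broad question.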

The practical consequence: you are not competing for one keyword anymore. You are competing to be the clearest, most retrievable answer to any of the sub-questions hiding inside a broader one. A page that cleanly answers one specific facet, with that facet stated plainly in a heading and resolved in the paragraph below it, is exactly the kind of source a fan-out branch lands on.
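One rough way to test a page against that pattern is to check whether each heading is followed by a paragraph of real text. The sketch below uses Python's standard `html.parser` module. The class and function names are made up for this example, and the rule it applies (a heading with no text paragraph before the next heading counts as unanswered) is a simplifying assumption, not a Google criterion.

```python
from html.parser import HTMLParser


class HeadingAnswerParser(HTMLParser):
    """Collects [heading, first paragraph after it] pairs from an HTML string."""

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.pairs = []                # each item: [heading_text, answer_text]
        self._current = None           # "heading" or "para" while inside one
        self._buffer = []
        self._awaiting_answer = False  # True after a heading, until a <p> closes

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._current, self._buffer = "heading", []
        elif tag == "p" and self._awaiting_answer:
            self._current, self._buffer = "para", []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        text = " ".join("".join(self._buffer).split())
        if tag in self.HEADINGS and self._current == "heading":
            self.pairs.append([text, ""])
            self._awaiting_answer = True
            self._current = None
        elif tag == "p" and self._current == "para":
            self.pairs[-1][1] = text
            self._awaiting_answer = False
            self._current = None


def unanswered_headings(html):
    """Return headings that have no text paragraph before the next heading."""
    parser = HeadingAnswerParser()
    parser.feed(html)
    parser.close()
    return [heading for heading, answer in parser.pairs if not answer]


if __name__ == "__main__":
    page = (
        "<h2>Is a standing desk worth it?</h2><p>Yes, if you alternate.</p>"
        "<h2>What are the downsides?</h2><img src='answer.png'>"
    )
    print(unanswered_headings(page))
```

The second heading in the sample is flagged because its "answer" is an image, which gives a retrieval system no text to lift.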

So what actually makes you eligible

Strip away the jargon and eligibility comes down to three plain conditions. None of them are new.

  • You are indexed. Google has to be able to crawl the page and add it to the index. For a quick check, use the site: operator (site:yourdomain.com/your-page); for a more reliable answer, use the URL Inspection tool in Google Search Console. If the page is not indexed, none of the rest matters.
  • You are helpful. The page genuinely answers the question a real person asked, written for people rather than to game a ranking. This is the same who/how/why standard Google applies to all content.
  • You are well-structured. Headings that mirror the questions people ask, real selectable text instead of images of text, direct answers placed right under the question. This is what makes a page easy to retrieve and easy to quote.
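The first condition starts with crawlability, and that part can be checked offline. The sketch below uses Python's standard `urllib.robotparser` to ask whether a given robots.txt would let Googlebot fetch a URL. The helper name `googlebot_may_fetch` and the sample rules are assumptions for illustration. Passing this check only means robots.txt is not blocking the crawler; it does not prove the page is indexed, and it ignores other blockers such as a noindex directive.

```python
from urllib.robotparser import RobotFileParser


def googlebot_may_fetch(robots_txt, url):
    """Return True if the given robots.txt text allows Googlebot to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


if __name__ == "__main__":
    # Hypothetical robots.txt content for example.com.
    robots = "User-agent: *\nDisallow: /drafts/\n"
    for url in ("https://example.com/blog/post", "https://example.com/drafts/post"):
        print(url, googlebot_may_fetch(robots, url))
```

Parsing the rules from a string keeps the example free of network calls; to check a live site you would first download its robots.txt yourself and pass the text in.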

That is the whole eligibility test. It is SEO. Google has said as much: for its AI features, optimizing is not a separate discipline; it comes down to being indexable, helpful, and clearly structured. Our guide to getting found by AI search walks through how to put each of these in place.

The tricks that do not work

Because the mechanism is misunderstood, a small industry of bad advice has grown around it. Worth naming the things Google has explicitly said not to do for its AI features:

  • Do not build an llms.txt file for Google. It does not use one. (The wider ecosystem is a different story, which is why Rankport still checks for it, but the file does nothing for Google.)
  • Do not chop your content into tiny fragments in the hope that AI prefers bite-sized chunks. It does not.
  • Do not rewrite your pages "for AI" as a separate version. Write well once.
  • Do not chase inauthentic mentions or manufactured citations.
  • Do not treat structured data as a magic requirement. It helps machines parse you; it is not an entry ticket to AI answers.
  • Do not spin up a page per query variation. That is scaled content abuse, and it is a way to get penalised, not retrieved.

The common thread: every one of these is an attempt to talk to the AI directly instead of just being a good source. The retrieval step does not reward signalling. It rewards being the document that best answers the sub-query it ran.

The short version

AI Overviews and AI Mode retrieve real pages from Google's normal index (that is RAG), and they do it across a spray of related sub-questions at once (that is query fan-out). You earn your way into the result by being indexed, being genuinely useful, and being structured so a machine can find and lift the answer.

No secret file. No separate AI version of your site. The thing that makes you eligible for an AI citation is the same thing that has always made a page worth ranking: it is the clear, trustworthy answer to a real question, and Google can read it.
