Should you let Google-Extended train on your site?
Blocking Google-Extended will not cost you a single ranking position. So the real decision is about training Gemini, not SEO. Here is how to weigh it honestly.
The short answer
The thing most people are quietly worried about is the wrong thing. Blocking Google-Extended does not lower your rankings, does not pull you out of Google Search, and does not get you flagged for hiding something. There is no SEO penalty either way. None.
Once that fear is off the table, the question becomes much smaller and much more honest: do you want your published content used to train Google's generative AI, or not? That is a values-and-strategy call, not a technical one, and either answer is defensible.
So if you came here braced for "block it and you'll tank your traffic", relax. That is not the trade you are making.
What Google-Extended actually controls
Google-Extended is not a crawler. It is a product token you reference in your robots.txt to set a permission. It governs one thing: whether Google can use your content to train and ground its generative AI models, the Gemini family and the products built on them.
The key fact that resolves most of the confusion is this: Google-Extended is separate from Googlebot.
Googlebot is the crawler that reads your pages and builds the search index. That index is what decides where you rank. Google-Extended sits beside it as a distinct permission layer. When you allow or disallow Google-Extended, you are answering a different question entirely, "can my words feed Gemini's training?", and saying nothing at all about whether Googlebot can keep indexing you for Search.
Two doors, two keys. Disallowing one does not lock the other.
What it does NOT control
Here is where people talk themselves into mistakes.
It does not touch your Search ranking. Your blue-link positions run off the Googlebot index. Google-Extended is not part of that machinery, so a Disallow line for it changes your rankings by exactly zero. Block it on a Friday and your traffic on Monday looks the same.
It is not a clean off-switch for "appearing in Google's AI". This is the nuance that trips up almost everyone. AI Overviews are assembled from the regular Search index, the same one Googlebot fills. They are not built from Google-Extended training data. So blocking Google-Extended does not remove you from AI Overviews, and allowing it does not buy you a seat there.
If your real goal is "stop showing up in Google's AI answers", Google-Extended is the wrong lever, because it governs Gemini's training and grounding, not your presence in Search-driven AI features. AI Overviews follow Search's own controls instead. Preview directives such as nosnippet and max-snippet limit what Google may quote from a page, but they trim your regular snippets too, and noindex removes the page from Search altogether. Staying out of AI Overviews completely means staying out of the Search index, which means losing your rankings. That is almost never a trade worth making.
Keep the two ideas apart and the whole topic gets simple. Training is one thing. Ranking and AI Overviews are another. Google-Extended only speaks to the first.
How to decide
The decision comes down to one question: are you comfortable with your content contributing to Gemini's training set?
Leave it open if you want maximum reach. Most sites should. If your content trains and grounds Google's models, you are part of the corpus those models draw on, which is a small bet on long-term visibility inside Google's AI products. There is no downside to your SEO, and you keep the widest possible surface area. For a business that mainly wants to be found, blocking gains you very little.
Block it if you have proprietary or premium content you would rather not see absorbed. Publishers behind a paywall, research firms, and anyone whose words are the product often prefer to withhold. That is a coherent position. You give up a hard-to-measure slice of AI reach in exchange for more control over where your content ends up. You lose nothing in Search. OpenAI's GPTBot is the mirror-image version of this call, and the article on whether blocking GPTBot hurts your SEO walks through the same logic from the ChatGPT side.
Be honest with yourself about that "hard to measure" part. Nobody can show you a clean before-and-after on the visibility cost of blocking, because Gemini does not expose which sites trained it. You are choosing on principle, not on a dashboard number. That is fine, as long as you know that is what you are doing.
The mechanics are two lines either way. Edit your robots.txt, the file served at yourdomain.com/robots.txt.
To allow training (the default if you do nothing):
User-agent: Google-Extended
Allow: /
To block training:
User-agent: Google-Extended
Disallow: /
One thing to hold onto: this directive does not touch Googlebot, so add it without fear. Your Googlebot rules, your sitemap, your rankings, all of it carries on untouched. If you want to opt out of training but stay fully indexed, leave Googlebot alone and only disallow Google-Extended.
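If you would rather check than take it on faith, here is a minimal sketch using Python's standard-library urllib.robotparser. The robots.txt content and the example.com URL are made up for illustration, and Python's parser is not Google's, so treat this as a sanity check on your rules rather than proof of how Google reads them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Gemini training, leave everything else open.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical page used only for the check.
PAGE = "https://example.com/pricing"


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "Google-Extended"):
        print(agent, "allowed:", can_fetch(ROBOTS_TXT, agent, PAGE))
```

With these rules the Googlebot check passes and the Google-Extended check fails, which is the "two doors, two keys" behaviour described above. To test your own file, paste its contents in place of ROBOTS_TXT.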
See what you are allowing
Most robots.txt files were written years ago and never opened again. Yours might be blocking a bot you meant to allow, or silently feeding training crawlers you would rather keep out, and you would have no way to know. Your server logs settle which crawlers actually reach you in practice, and the guide on checking whether AI crawlers visit your site shows how to read them.
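As a rough illustration of reading those logs, the sketch below counts hits per crawler in access-log lines. It assumes the common "combined" log format, where the user agent is the last quoted field. The sample lines and the list of tokens are made up for the example, so swap in your own. One caveat: Google-Extended has no user-agent string of its own. Google's crawling shows up under its existing user agents such as Googlebot, so you will never see a "Google-Extended" hit in a log.

```python
import re
from collections import Counter

# Assumed substrings to look for in the User-Agent field; extend as needed.
CRAWLER_TOKENS = ["Googlebot", "GPTBot", "ClaudeBot", "PerplexityBot"]

# Made-up sample lines in combined log format, for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2025:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2025:13:56:02 +0000] "GET /blog HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0)"',
    '198.51.100.4 - - [10/Oct/2025:13:57:10 +0000] "GET /pricing HTTP/1.1" 200 1024 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/130.0"',
    '66.249.66.1 - - [10/Oct/2025:13:58:44 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]


def count_crawler_hits(log_lines):
    """Count requests per crawler token, matching on the last quoted field."""
    hits = Counter()
    for line in log_lines:
        quoted = re.findall(r'"([^"]*)"', line)
        if not quoted:
            continue  # line is not in the expected format
        user_agent = quoted[-1].lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in user_agent:
                hits[token] += 1
    return hits


if __name__ == "__main__":
    for token, count in count_crawler_hits(SAMPLE_LOG).most_common():
        print(token, count)
```

User-agent strings can be spoofed, so a match here tells you what a client claimed to be, not who it was. Treat the counts as a first pass.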
Rankport reads your robots.txt and shows which AI training and search bots you currently allow or block, with the trade-off behind each one spelled out in plain English. Run the robots.txt checker to see exactly where you stand, or the AI visibility checker to find out whether the answer engines can read you at all.
// related articles
- How to check if AI crawlers are actually visiting your site
  Want to know if ChatGPT, Perplexity, and Claude crawl your site? Read your server logs. Here are the user-agents to grep, how to read the hits, and what nothing means.
- Does blocking GPTBot hurt your SEO?
  No. GPTBot is OpenAI's crawler, not Google's, so blocking it is invisible to Search. The real cost is AI visibility, not ranking. Here is how to decide.
- ChatGPT shows wrong information about your business — how to fix it
  ChatGPT keeps getting your business facts wrong? Here is why it happens, how to tell which cause you are fighting, and the practical fixes that actually correct it.