Google-Extended
Google-Extended is assigned to the AI category by BotScope. Detection is based on the user-agent string. Current pattern:
Google-Extended
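Opt-out token for AI training data (Bard/Gemini). If you want to exclude your content from Google AI training, block it in robots.txt with the line "User-agent: Google-Extended" followed by the line "Disallow: /". This does NOT affect regular Google Search. Note that Google documents Google-Extended as a robots.txt control token without a separate HTTP user-agent string; the crawling itself is done with Google's existing user agents.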
AI training crawlers collect web content as training material for large language models. They bring no direct traffic: your content ends up in the model and influences future AI answers. Most honour robots.txt, but some are more aggressive than search engine crawlers.
If requests matching Google-Extended show up in your logs, BotScope lets you analyze their crawl paths, status codes, IP distribution and temporal activity by hour, day or week.
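The same kind of breakdown can be approximated by hand. The following is a minimal sketch, not BotScope's implementation: it assumes an access log in the common "combined" format and a case-insensitive substring match on the user-agent pattern. The function name summarize_bot_hits, the IP addresses and the user-agent strings in the sample lines are all hypothetical and synthetic.

```python
import re
from collections import Counter
from datetime import datetime

# Combined log format:
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_bot_hits(lines, pattern="Google-Extended"):
    """Count hits per path, status code, IP and hour for matching user agents."""
    paths, statuses, ips, hours = Counter(), Counter(), Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in combined log format
        if pattern.lower() not in match["agent"].lower():
            continue  # user agent does not contain the pattern
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        paths[match["path"]] += 1
        statuses[match["status"]] += 1
        ips[match["ip"]] += 1
        hours[when.strftime("%Y-%m-%d %H:00")] += 1
    return {"paths": paths, "statuses": statuses, "ips": ips, "hours": hours}


# Synthetic sample lines (documentation IP ranges, invented user agents).
SAMPLE = [
    '192.0.2.10 - - [01/Mar/2024:09:15:02 +0000] "GET /blog/ HTTP/1.1" '
    '200 5120 "-" "ExampleAgent (Google-Extended)"',
    '192.0.2.10 - - [01/Mar/2024:09:47:30 +0000] "GET /missing HTTP/1.1" '
    '404 312 "-" "ExampleAgent (Google-Extended)"',
    '198.51.100.7 - - [01/Mar/2024:10:01:11 +0000] "GET /blog/ HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    for name, counter in summarize_bot_hits(SAMPLE).items():
        print(name, dict(counter))
```

Grouping by day or week instead of by hour only requires a different strftime format for the bucket key, for example "%Y-%m-%d" or "%G-W%V".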
robots.txt directive for Google-Extended
If you don't want Google to use your content for AI training, add the following block to your /robots.txt. This works only for bots that honour robots.txt; malicious crawlers ignore it.
User-agent: Google-Extended
Disallow: /
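To check that such a block only affects the intended token, the rules can be evaluated locally before deployment. This is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the URL https://example.com/blog/ are assumptions for illustration, and Python's matching logic is not guaranteed to be identical to Google's own robots.txt parser.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt block from this page, as a string.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/blog/"  # hypothetical URL
    print("Google-Extended:", is_allowed(ROBOTS_TXT, "Google-Extended", url))
    print("Googlebot:", is_allowed(ROBOTS_TXT, "Googlebot", url))
```

With these rules the check reports Google-Extended as disallowed and Googlebot as allowed, which matches the statement that regular Google Search is not affected.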