Crawlability

Crawlability shows which AI bots a site’s robots.txt file allows or blocks. Select a tracked domain to get results instantly; no setup or account connection is required. Peec checks the domain’s robots.txt against 40+ AI bots from 20+ vendors and shows the status for each. Bots are categorized based on publicly available data and their stated purpose, and categories may be refined as real-world behavior becomes clearer.

Crawlability table

The table breaks down the status for each individual bot:

Bot: the user-agent identifier (e.g., GPTBot, ClaudeBot).
Platform: the AI vendor behind the bot (e.g., OpenAI, Anthropic, Google).
Bot type: Training, Search, User Query, or Other.
Status: Allowed, Partial, or Blocked.
Reason: how the status was determined, either from explicit rules for this bot or inherited from the global wildcard (*) rules.

Use the in-table search bar to find a specific bot, or narrow the table by platform, bot type, or status using the filters at the top.
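To make the Status and Reason columns more concrete, here is a minimal Python sketch of how such values could be derived from a robots.txt file. It is not Peec's implementation: the function names `parse_groups` and `classify` are hypothetical, and the thresholds for Allowed, Partial, and Blocked are simplifying assumptions (for example, any disallowed path short of the whole site counts as Partial).

```python
def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lowercase user-agent token to its (directive, path) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current_agents: list[str] = []
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a new User-agent line after rules starts a new group
                current_agents, in_rules = [], False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in current_agents:
                groups[agent].append((field, value))
    return groups


def classify(robots_txt: str, bot: str) -> tuple[str, str]:
    """Return an assumed (status, reason) pair for one bot."""
    groups = parse_groups(robots_txt)
    if bot.lower() in groups:
        rules, reason = groups[bot.lower()], "explicit rules"
    elif "*" in groups:
        rules, reason = groups["*"], "inherited from wildcard (*)"
    else:
        return "Allowed", "no matching rules"
    # An empty Disallow value blocks nothing, so it is ignored here.
    disallows = [path for directive, path in rules if directive == "disallow" and path]
    allows = [path for directive, path in rules if directive == "allow" and path]
    if not disallows:
        return "Allowed", reason
    if "/" in disallows and not allows:
        return "Blocked", reason
    return "Partial", reason


SAMPLE = """
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""

print(classify(SAMPLE, "GPTBot"))
print(classify(SAMPLE, "ClaudeBot"))
```

In this sample, GPTBot has its own group and is classified from those explicit rules, while ClaudeBot has none and falls back to the wildcard group.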

URL Tester

Enter any URL on your domain to see which AI bots your robots.txt rules allow or block from crawling it. You can then use this insight to decide whether allowing different bots to crawl that URL is beneficial.
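A per-URL check of this kind can be approximated with Python's standard `urllib.robotparser` module. The sketch below is only an illustration, not how the URL Tester is built: the helper name `check_url`, the short bot list, and the `example.com` URLs are assumptions, and the robots.txt content is passed in as a string so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser

# A small illustrative subset of AI bot user agents.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot"]


def check_url(robots_txt: str, url: str, bots=AI_BOTS) -> dict[str, bool]:
    """Return {bot: True if the rules allow it to crawl the URL}."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}


ROBOTS = """
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /drafts/
"""

for url in ("https://example.com/blog/post", "https://example.com/drafts/x"):
    print(url, check_url(ROBOTS, url))
```

Note that `urllib.robotparser` uses simple prefix matching and first-match ordering, so its answers can differ from crawlers that support wildcards or longest-match precedence.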

Interpreting Crawlability

If a bot is blocked and honors robots.txt, it won’t access your content, which means it can’t use your site as a source in its responses. Use Crawlability to:

Catch accidental blocking before it affects your AI visibility
Understand which AI ecosystems can and can’t access your content
Verify that changes to your robots.txt are working as expected
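One way to reason about the last point is to compare which bots can reach a URL before and after a robots.txt edit. The following is a rough sketch under stated assumptions: `allowed_bots` and `access_changes` are hypothetical helpers, the two robots.txt versions are supplied as strings, and matching follows the simplified rules of Python's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser


def allowed_bots(robots_txt: str, url: str, bots: list[str]) -> set[str]:
    """Return the bots that the given rules allow to crawl the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot for bot in bots if parser.can_fetch(bot, url)}


def access_changes(old_txt: str, new_txt: str, url: str, bots: list[str]) -> dict[str, list[str]]:
    """List bots whose access to the URL changed between two versions."""
    before = allowed_bots(old_txt, url, bots)
    after = allowed_bots(new_txt, url, bots)
    return {
        "newly_allowed": sorted(after - before),
        "newly_blocked": sorted(before - after),
    }


OLD = "User-agent: *\nDisallow:\n"
NEW = "User-agent: *\nDisallow: /\n\nUser-agent: GPTBot\nAllow: /\n"

print(access_changes(OLD, NEW, "https://example.com/", ["GPTBot", "ClaudeBot"]))
```

A non-empty `newly_blocked` list after an edit that was not meant to restrict anyone is the kind of accidental blocking worth catching early.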

Bots

The Bots table shows which bots, from which vendor, may be accessing your pages, and the type each one falls under.

AI bot	Platform	Type	Purpose / Note
YouBot	You.com	Other	Fetches pages to power You.com’s AI search results.
omgili	Webz.io	Training	Forum and discussion crawler for structured dataset building.
Perplexity-User	Perplexity	User Query	Used during a user’s Deep Research session.
Amazonbot	Amazon	Training	General training for Titan/Olympus models.
Google-Agent	Google	User Query	Used by Google agents to navigate the web and perform actions upon user request (e.g. Project Mariner).
cohere-training-data-crawler	Cohere	Training	Specialized crawler for raw training data.
ClaudeBot	Claude (Anthropic)	Training	Official training bot for Anthropic models.
Gemini-Deep-Research	Google	User Query	High-intensity agent for user-requested research.
Google-CloudVertexBot	Google	Search	Crawling for Google Cloud Vertex AI services.
Google-Extended	Google	Training	Opt-out token for Gemini training and AI product improvement.
PanguBot	PanGu (Huawei)	Training	Training for Huawei’s Pangu models.
ChatGPT-User	ChatGPT (OpenAI)	User Query	Visits links directly provided by a user.
CCBot	Common Crawl	Training	Massive open-source web archive for AI labs.
GrokBot	Grok (xAI)	Training	Real-time web search and training for Grok 3/4 models.
DuckAssistBot	DuckDuckGo	User Query	Summarizes pages for DuckDuckGo’s AI responses.
omgilibot	Webz.io	Other	Forum-specific crawler variant. Commercial data product.
Diffbot	Diffbot	Training	Structured data extraction as a service.
GoogleAgent-Mariner	Google	User Query	Action Agent: Can fill forms and click buttons.
TikTokSpider	ByteDance	Other	Specialized scraper for TikTok’s AI data.
Webzio-Extended	Webz.io	Training	Large-scale data scraping for AI providers.
Bytespider	ByteDance	Training	Training for TikTok and ByteDance AI.
Applebot-Extended	Apple	Training	Used for training Apple’s generative features.
OAI-SearchBot	ChatGPT (OpenAI)	Search	Real-time retriever for ChatGPT answers.
DeepSeekBot	DeepSeek	Training	Training for the DeepSeek model series.
PerplexityBot	Perplexity	Search	Fact-checking and retrieval for Perplexity.
Claude-Web	Claude (Anthropic)	Other	Legacy bot for web browsing during Claude interactions.
Grok-DeepSearch	Grok (xAI)	Search	Real-time web search for Grok’s deep research feature.
Ai2Bot-Dolma	Allen Institute	Training	Specifically builds the Dolma open dataset.
Manus-User	Meta	User Query	Action Agent: Navigates and interacts with sites.
FacebookBot	Meta	Training	Web crawler used by Meta for AI training data collection.
AzureAI-SearchBot	Microsoft	Search	Web retrieval for Azure AI services.
xAI-Grok	Grok (xAI)	Search	General-purpose web search bot for xAI/Grok.
Timpibot	Timpi	Training	Decentralized search engine for AI.
Claude-SearchBot	Claude (Anthropic)	Search	Anthropic’s specific bot for its search features.
MistralAI-User	Mistral	User Query	On-demand browser for Mistral users.
Claude-User	Claude (Anthropic)	User Query	Triggered when a user prompts with a specific link.
Amzn-SearchBot	Amazon	Search	Search bot for Amazon’s AI shopping features.
MyCentralAIScraperBot	Unknown	Other	Centralized AI data collection tool.
GPTBot	ChatGPT (OpenAI)	Training	Primary crawler for foundational training.
anthropic-ai	Claude (Anthropic)	Training	General data collection and model training.
meta-webindexer	Meta	Search	Search indexing for Meta’s AI assistants.
NovaAct	Amazon	User Query	Agent for automated web-based workflows.
meta-externalfetcher	Meta	User Query	Used for real-time link expansion on Meta.
CloudVertexBot	Google	Training	Cloud-based AI deployment and indexing.
Ai2Bot	Allen Institute	Training	General-purpose web crawler for Allen Institute AI research.
Meta-ExternalAgent	Meta	Training	High-velocity training crawler for Llama.
quillbot.com	QuillBot	User Query	Fetches content to power QuillBot’s AI writing tools.
Applebot	Apple	Search	Gathers data to power Spotlight, Siri, and Safari search functionality.
cohere-ai	Cohere	Training	Training for enterprise-grade LLMs.
