# AllThingsAI.work robots.txt
# Policy: Allow citing/indexing crawlers; block training-only crawlers.
# Rationale: AI citation bots (PerplexityBot, OAI-SearchBot) drive referral
# traffic and surface the site in AI-assisted search. Training-only crawlers
# (GPTBot, ClaudeBot, CCBot, Google-Extended) consume content without
# attribution. We allow the search-serving variants and block the
# pure-training variants per operator preference.

# Traditional search engines - full access
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: Slurp
Allow: /

User-agent: DuckDuckBot
Allow: /

# AI search / citation bots - full access (these drive referral traffic)
User-agent: PerplexityBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: YouBot
Allow: /

# OpenAI GPTBot - training-only, blocked
User-agent: GPTBot
Disallow: /

# Anthropic ClaudeBot - training crawler, no search product
User-agent: anthropic-ai
Disallow: /

User-agent: ClaudeBot
Disallow: /

# Google-Extended - Gemini/Bard training; search handled by Googlebot above
User-agent: Google-Extended
Disallow: /

# Common Crawl - dataset builder, no attribution
User-agent: CCBot
Disallow: /

# Bytespider (ByteDance/TikTok) - training crawler
User-agent: Bytespider
Disallow: /

# Meta external agent - training crawler
User-agent: meta-externalagent
Disallow: /

# Apple AI training - block, allow regular Applebot for search
User-agent: Applebot-Extended
Disallow: /

# Amazon Alexa training
User-agent: Amazonbot
Disallow: /

# Default - allow everything not explicitly addressed above
User-agent: *
Allow: /
Disallow: /admin/

Sitemap: https://allthingsai.work/sitemap-index.xml