Robots.txt for huffpost.com
# Cambria robots User-agent: grapeshot Disallow: /member Disallow: /*?*err_code=404 Disallow: /search Disallow: /search/?* User-agent: * Crawl-delay: 4 Disallow: /*?*page= Disallow: /member Disallow: /*?*err_code=404 Disallow: /search Disallow: /search/?* Disallow: /mapi/v4/*/user/* Disallow: /embed User-agent: Amazonbot Disallow: / User-agent: anthropic-ai Disallow: / User-agent: Applebot-Extended Disallow: / User-agent: CCBot Disallow: / User-agent: ChatGPT-User Disallow: / User-agent: Claude-Web Disallow: / User-agent: ClaudeBot Disallow: / User-agent: cohere-ai Disallow: / User-agent: Diffbot Disallow: / User-agent: FacebookBot Disallow: / User-agent: google-extended Disallow: / User-agent: Google-Extended Disallow: / User-agent: Googlebot Allow: / Disallow: /*?*err_code=404 Disallow: /search Disallow: /search/?* User-agent: GPTBot Disallow: / User-agent: magpie-crawler Disallow: / User-agent: PerplexityBot Disallow: / User-agent: TurnitinBot Disallow: / # archives Sitemap: https://www.huffpost.com/sitemaps/sitemap-v1.xml Sitemap: https://www.huffpost.com/sitemaps/sitemap-google-news.xml Sitemap: https://www.huffpost.com/sitemaps/sitemap-google-video.xml Sitemap: https://www.huffpost.com/sitemaps/sections.xml # huffingtonpost.com archive sitemaps Sitemap: https://www.huffpost.com/sitemaps-huffingtonpost/sitemap.xml Sitemap: https://www.huffpost.com/sitemaps-huffingtonpost/sections.xml