
Robots.txt Examples

Robots.txt file content for theglobeandmail.com.

Robots.txt file for: theglobeandmail.com

User-agent: Googlebot-News
Disallow: /feeds/
Disallow: /incoming/
Disallow: /test/
Disallow: /partners/
Disallow: /search/
Disallow: /business/adv/appointmentnotices/search/

User-agent: AdsBot-Google
Disallow: /feeds/
Disallow: /incoming/
Disallow: /test/
Disallow: /search/
Disallow: /business/adv/appointmentnotices/search/

User-agent: *
Disallow: /feeds/
Disallow: /incoming/
Disallow: /test/
Disallow: /search/
Disallow: /business/adv/appointmentnotices/search/
Disallow: /marketing-containers/
Disallow: /coupons/
Disallow: /files/advertising/

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-Agent: PerplexityBot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: AwarioRssBot
User-agent: AwarioSmartBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: magpie-crawler
Disallow: /

User-agent: NewsNow
Disallow: /

User-agent: news-please
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: peer39_crawler
User-agent: peer39_crawler/1.0
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Scrapy
Disallow: /

User-agent: TurnitinBot
Disallow: /

Sitemap: https://www.theglobeandmail.com/sitemap.xml?outputType=xml
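
To see how groups like these are applied, the sketch below feeds a shortened excerpt of the file above to Python's standard-library `urllib.robotparser` and asks whether a given crawler may fetch a given path. The excerpt, the helper names `build_parser` and `is_allowed`, the agent name `SomeOtherBot`, and the sample paths are illustrative assumptions, not part of the original file. Python's parser uses simple prefix and substring matching, so real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the rules listed above (illustration only).
RULES = """\
User-agent: Googlebot-News
Disallow: /feeds/
Disallow: /partners/
Disallow: /search/

User-agent: *
Disallow: /feeds/
Disallow: /search/
Disallow: /coupons/

User-agent: GPTBot
Disallow: /

Sitemap: https://www.theglobeandmail.com/sitemap.xml?outputType=xml
"""

BASE = "https://www.theglobeandmail.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` according to the parsed rules."""
    return parser.can_fetch(agent, BASE + path)


parser = build_parser(RULES)

# GPTBot has its own group that blocks the whole site.
print(is_allowed(parser, "GPTBot", "/canada/"))

# Googlebot-News matches its own group, which does not list /coupons/.
print(is_allowed(parser, "Googlebot-News", "/coupons/"))

# A crawler with no dedicated group falls back to the "*" group.
print(is_allowed(parser, "SomeOtherBot", "/coupons/"))
print(is_allowed(parser, "SomeOtherBot", "/canada/"))

# Sitemap lines are exposed separately (Python 3.8 or later).
print(parser.site_maps())
```

A crawler that matches a named group uses only that group's rules, which is why the file repeats the common `Disallow` lines under `Googlebot-News` and `AdsBot-Google` instead of relying on the `*` group.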