Robots.txt for theathletic.com
# New York Times content is made available for your personal, non-commercial # use subject to our Terms of Service here: # https://help.nytimes.com/hc/en-us/articles/115014893428-Terms-of-Service. # Use of any device, tool, or process designed to data mine or scrape the content # using automated means is prohibited without prior written permission from # The New York Times Company. Prohibited uses include but are not limited to: # (1) text and data mining activities under Art. 4 of the EU Directive on Copyright in # the Digital Single Market; # (2) the development of any software, machine learning, artificial intelligence (AI), # and/or large language models (LLMs); # (3) creating or providing archived or cached data sets containing our content to others; and/or # (4) any commercial purposes. # Contact https://nytlicensing.com/contact/ for assistance. # General Rules User-agent: * User-agent: Googlebot Disallow: /wp/wp-admin/ Allow: /wp/wp-admin/admin-ajax.php Disallow: /async-* Disallow: /search/* Allow: /search/$ Disallow: /checkout/ Disallow: /checkout?plan_id* Allow: /checkout/$ Disallow: /checkout2* Disallow: /login/ Disallow: /login?login_source* Disallow: /login?ref_page* Allow: /login/$ Disallow: /login2/ Disallow: /login2?login_source* Disallow: /login2?ref_page* Allow: /login2/$ Disallow: /report/ Disallow: /*/discuss/* Disallow: /register/ Disallow: /register?welcome_redirect* Disallow: /register2/ Disallow: /register2?welcome_redirect* Disallow: /betmgm-redirect* Disallow: /cdn-cgi/ Disallow: /verizon/* Disallow: /forgot-password/* Allow: /forgot-password/$ Disallow: /forgot-password2/* Allow: /forgot-password2/$ Disallow: /amp-social-login* Disallow: /track-analytics/ Disallow: /amp-auth/ Disallow: /rss-feed/ Disallow: /global-color-test.php Disallow: /global-font-test.php Disallow: /graphql* Disallow: /api* Disallow: /ip* Disallow: /call-set-cookie-with-context/* Disallow: /get-current-user/ Disallow: /pv.json Disallow: /following-feed-test/* Disallow: */boxscore/* Disallow: /feed-test/ Disallow: /migrate/ # misc bots User-agent: Amazonbot Disallow: / User-agent: anthropic-ai Disallow: / User-agent: AwarioRssBot User-agent: AwarioSmartBot Disallow: / User-agent: Bytespider Disallow: / User-agent: CCBot Disallow: / User-agent: ChatGPT-User Disallow: / User-agent: ClaudeBot Disallow: / User-agent: Claude-Web Disallow: / User-agent: cohere-ai Disallow: / User-agent: DataForSeoBot Disallow: / User-agent: Diffbot Disallow: / User-agent: FacebookBot Disallow: / User-agent: Google-Extended Disallow: / User-agent: GPTBot Disallow: / User-agent: magpie-crawler Disallow: / User-agent: NewsNow Disallow: / User-agent: news-please Disallow: / User-agent: omgili Disallow: / User-agent: omgilibot Disallow: / User-agent: peer39_crawler User-agent: peer39_crawler/1.0 Disallow: / User-agent: PerplexityBot Disallow: / User-agent: Scrapy Disallow: / User-agent: TurnitinBot Disallow: / # Googlebot Specific Rules User-agent: Googlebot Disallow: *adgroupid* Disallow: *campaignid* Disallow: *ad_id* Disallow: *access_token* Disallow: *amp_reader_id* Disallow: */?source=* Disallow: /live-blogs/*embed=1*