Entity

robots.txt

18 articles stable
Articles: 18 mentions
Velocity: 0.0% growth rate
Acceleration: +0.333 velocity change
Sources: 11 publications

Coverage Timeline

2025-08-12
The Verge 23 related

Reddit says it will block the Internet Archive from indexing every page but its homepage, after catching AI companies scraping its data from the Wayback Machine

it was illegally collected by AI companies Andrew Nusca / Fortune : Ford's new EV strategy includes $2 billion U.S. investment Amanda Yeo / Mashable : Reddit is blocking Wayback Machine from archiving...

2025-08-04
Cloudflare 5 related

Cloudflare says Perplexity uses stealth crawling techniques, like undeclared user agents and rotating IP addresses, to evade robots.txt rules and network blocks

We are observing stealth crawling behavior from Perplexity, an AI-powered answer engine.  Although Perplexity initially crawls …

2025-01-12
TechCrunch 4 related

OpenAI's crawlers took down e-commerce site Triplegangers by relentlessly trying to scrape the entire site, whose robots.txt file was not properly configured

techcrunch.com/2025/01/10/h...  #google #seo #openai [image] @tante.cc : #OpenAI is basically the locusts of the digital by now.  Their massive scrapers crushing websites in order to steal and feed th...

2025-01-11
TechCrunch 4 related

OpenAI's crawlers took down e-commerce site Triplegangers by relentlessly trying to scrape the entire site, whose robots.txt file was not properly configured

On Saturday, Triplegangers CEO Oleksandr Tomchuk was alerted that his company's e-commerce site was down. Bluesky: @valkayec , @glynmoody , and @tante.cc Mastodon: @remixtures@tldr.nettime.org , @DrPe...

2024-07-30
404 Media 7 related

Some popular sites like Condé Nast's titles and Reuters.com modified robots.txt to block Anthropic's bots, but Anthropic has just made new bots with other names

We really are going to need a shared blocklist that doesn't rely on putting your website behind Cloudflare.  —  https://www.404media.co/... Jason Koebler / @jasonkoebler@mastodon.social : Many website...

2024-07-25
Search Engine Land 8 related

Microsoft says “Bing stopped crawling Reddit” after Reddit updated its robots.txt file on July 1 to prohibit “all crawling of their site”

Reddit has updated its robots.txt file, preventing Bing and many other search engines from crawling the site.
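
For readers unfamiliar with how a robots.txt rule that prohibits "all crawling" is read by a compliant crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The rule text, the `is_allowed` helper, the `examplebot` user agent, and the `example.com` URLs are all illustrative assumptions. This is not Reddit's actual file, and it says nothing about how Bing or any other crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only: a blanket "disallow everything"
# policy for every user agent. Not copied from any real site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical permissive counterpart: an empty Disallow value allows everything.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(rules_text: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(BLOCK_ALL, "examplebot", "https://example.com/r/some-page"))
    print(is_allowed(ALLOW_ALL, "examplebot", "https://example.com/r/some-page"))
```

robots.txt is advisory: the check only has an effect when the crawler chooses to run it, which is why several of the stories in this timeline concern crawlers that reportedly do not.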

2024-07-06
TechCrunch 22 related

Cloudflare launches a tool that aims to block bots from scraping websites for AI training data, available free for all its customers

“We hear clearly that customers don't want AI bots visiting their websites, and especially those that do so dishonestly.  To help, we've added a brand new one-click to block all AI bots. … X: @cloudfl...

2024-06-26
Engadget 7 related

Reddit says it will update its robots.txt to make “as clear as possible” that companies “using an automated agent to access Reddit” need to abide by its terms

The warning comes after reports that AI companies regularly ignore instructions not to scrape.

2024-06-23
Fast Company 11 related

In response to plagiarism allegations, Perplexity CEO Aravind Srinivas says the company “is not ignoring” robots.txt, but does rely on third-party web crawlers

* what we do is highly technical, you don't understand  —  * it wasn't us it was a third party service/contractor/vendor  —  https://www.fastcompany.com/ ... @bsmall2@mstdn.jp : Automated Plagiarism f...

2024-06-22
Fast Company 6 related

In response to plagiarism allegations, Perplexity CEO Aravind Srinivas says the company “is not ignoring” robots.txt, but does rely on third-party web crawlers

The AI search startup Perplexity is in hot water in the wake of a Wired investigation revealing that the startup …

