Technology

Common Crawl

5 articles stable
Articles: 5 mentions
Velocity: 0.0% growth rate
Acceleration: +0.667 velocity change
Sources: 5 publications

Coverage Timeline

2025-11-04
The Atlantic

A profile of nonprofit Common Crawl, which has scraped billions of webpages since 2013, including paywalled ones, to build an archive used by OpenAI and others

Editor's note: This work is part of AI Watchdog, The Atlantic's ongoing investigation into the generative-AI industry.

2024-09-13
TechCrunch 8 related

The White House says Adobe, Cohere, Microsoft, Anthropic, OpenAI, and Common Crawl made voluntary commitments to combat nonconsensual image deepfakes and CSAM

The White House has announced that several major AI vendors, including OpenAI and Microsoft, have committed to taking steps …

2024-02-07
Mozilla Foundation

An in-depth look at Common Crawl, the 9.5PB web crawl archive dating back to 2008 run by a small nonprofit, its role in generative AI, its dataset, and more

Common Crawl's Impact on Generative AI; Common Crawl's mission: Enabling others to work like Google; Common Crawl's data: Machine scale analysis

2024-02-06
Bloomberg 4 related

On Meta's Q4 call, Mark Zuckerberg said Meta's next step in AI is “learning” from user data, and the dataset is larger than Common Crawl, raising privacy fears

Zuckerberg's Plan for AI Hinges on Your Facebook and Instagram Data https://www.bloomberg.com/... @business : Facebook's path to riches has hurt many, and so might its road to ...

2024-02-02
Meta 62 related

Meta reports Q4 revenue up 25% YoY to $40.1B, net income up 201% YoY to $14B, and family daily active people up 8% YoY to 3.19B for December 2023

Meta Platforms (META) … Salvador Rodriguez / Wall Street Journal : Facebook Parent Meta Initiates Dividend as Growth Continues Jonathan Vanian / CNBC : Mark Zuckerberg s...
