VOICE ARCHIVE

Elie

@eliebakouch
24 posts
2026-02-14
ok this is very interesting, this is not the same perf than gpt5.3, and might not be the same arch as well? > Codex-Spark marks the first milestone in our partnership with Cerebras. Codex-Spark is optimized to feel near-instant when served on ultra-low latency hardware (from [image]
ZDNET

OpenAI debuts a research preview of GPT-5.3-Codex-Spark, a smaller version of GPT-5.3-Codex that it claims generates code 15 times faster, for ChatGPT Pro users

ZDNET's key takeaways  — OpenAI targets “conversational” coding, not slow batch-style agents.  — Big latency wins: 80% faster roundtrip, 50% faster time-to-first-token.

2026-02-13
wtf, minimax M2.5 benchmark are insane and it's probably the same base model so only 10B active parameters??? [image]
MiniMax

MiniMax releases M2.5, claiming the model delivers on the “intelligence too cheap to meter” promise, priced at $0.30/1M input tokens and $1.20/1M output tokens

Today we're introducing our latest model, MiniMax-M2.5.  —  Extensively trained with reinforcement learning …

2026-02-12
GLM-5 is out, amazing release with very very good benchmark scores even on tasks like @andonlabs vending bench 2 i think one of the most crazy parts of this is that the RL framework that they use is open (based on megatron for training, @sgl_project for inference), it's somewhat [image]
Z.ai

Z.ai launches GLM-5, saying its flagship open-weight model has “best-in-class performance among all open-source models” in reasoning, coding, and agentic tasks

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks.  Scaling is still one of the most important ways …

GLM-5 is out, amazing release with very very good benchmark scores even on tasks like @andonlabs vending bench 2 i think one of the most crazy parts of this is that the RL framework that they use is open (based on megatron for training, @sgl_project for inference), it's somewhat [image]
Reuters

Z.ai says it will raise prices by at least 30% for new GLM coding plan subscribers to accommodate surging demand for its AI coding tools

2026-02-10
GLM 5 is 2x the total parameter of GLM 4.5 + deepseek sparse attention for efficient long context this is going to be a crazy model [image]
Nikkei Asia

Alibaba and Tencent are releasing new models and spending millions on “red envelope” freebies to woo users ahead of the Lunar New Year

HONG KONG — China's biggest AI companies are releasing new models and handing out “red envelope” …

GLM 5 is 2x the total parameter of GLM 4.5 + deepseek sparse attention for efficient long context this is going to be a crazy model [image]
The Information

Source: Chinese AI startup Zhipu anonymously released its new AI model GLM-5 on OpenRouter under the name Pony Alpha; Zhipu plans to debut GLM-5 later this week

Zhipu, one of China's prominent AI developers, has anonymously released its new large language model under a different name on OpenRouter …

2026-01-27
Kimi K2.5 is NOT just a small iteration on top of k2, it's now have fully multimodal understanding INCLUDING video! [image]
Kimi

Moonshot says Kimi K2.5 builds on K2 with “pretraining over ~15T mixed visual and text tokens” and “can self-direct an agent swarm with up to 100 sub-agents”

Today, we are introducing Kimi K2.5, the most powerful open-source model to date.

very nice release by the kimi team, benchmarks are on par with opus 4.5, gpt 5.2 xhigh, gemini 3.0 pro there is also some nice details on the parallel RL part in the tech blog explaining how they build K2.5 agent swarm [image]
Bloomberg

Chinese startup Moonshot releases Kimi K2.5, saying the model can process text, images, and videos simultaneously and beats its open-source peers in some tests

Alibaba Group Holding Ltd.-backed Moonshot AI released an upgrade of its flagship model, heating up a domestic arms race ahead …

2025-12-23
the gap in design taste and vibe coding ability between GLM 4.6 and GLM 4.7 is impressive (see the blog for more examples), seems to be the main focus of this release expecting minimax M2.1 to focus on the same thing so it's going to be interesting! [image]
Z.ai

Chinese AI startup Z.ai releases GLM-4.7, an open-weight model that Z.ai says delivers significant improvements in coding performance compared to GLM-4.6


2025-12-01
very interesting table from deepseek v3.2 that compares the output token count on different benchmarks, dsv3.2 speciale version thinks much more than any other model, BUT since they are using sparse attention the inference cost will still be ok? [image]
Bloomberg

DeepSeek releases DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which it calls “reasoning-first models built for agents”, after releasing V3.2-Exp in September

China's DeepSeek unveiled two new versions of an experimental artificial-intelligence model it released weeks ago …

2025-11-30
deepseek math v2 is the first open source model to reach gold on IMO? and we get a tech report, what an amazing release [image]
The Decoder

DeepSeek says its new DeepSeekMath-V2 model got gold-medal level status on the International Mathematical Olympiad 2025 and Chinese Mathematical Olympiad 2024

where models prove formal mathematical theorems—GPT-5 scores 20%.  Gemini Deep Think IMO Gold hits 65.7%.  DeepSeek Math V2 (Heavy) scores 61.9%.  That's second place—but Gemini is...

2025-11-07
we're very close to 50% on HLE, and bonus point: it's with an open model :) [image]
CNBC

Chinese startup Moonshot releases Kimi K2 Thinking, an open-weight model it claims beats GPT-5 in agentic capabilities; source: the model cost $4.6M to train

Chinese startup Moonshot on Thursday released its latest generative artificial intelligence model which claims to beat OpenAI's ChatGPT in …

> “200-300 sequential tool calls” this is really the impressive part of this release imo, can't wait to see how they did it [image]
CNBC

Chinese startup Moonshot releases Kimi K2 Thinking, an open-weight model it claims beats GPT-5 in agentic capabilities; source: the model cost $4.6M to train

Chinese startup Moonshot on Thursday released its latest generative artificial intelligence model which claims to beat OpenAI's ChatGPT in …