VOICE ARCHIVE

@jiqizhixin
3 posts
2026-01-02
On the first day of the New Year, DeepSeek released a major paper. They try to fix the training instability that plagues advanced neural network designs. Enter mHC: Manifold-Constrained Hyper-Connections. They take the powerful but unstable “Hyper-Connections” architecture [image]
South China Morning Post

DeepSeek researchers detail mHC, a new architecture they used to train 3B, 9B, and 27B models, finding it scaled without adding significant computational burden

DeepSeek has published a technical paper co-authored by founder Liang Wenfeng proposing a rethink of its core deep learning architecture
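For context on the architecture the post and headlines refer to: hyper-connections replace a Transformer's single residual stream with several parallel streams whose read, mix, and write weights are learned, which adds expressivity but can destabilize training. The PyTorch sketch below is a minimal illustration of that idea; the class and parameter names, and the row-stochastic softmax used here as a stand-in for the manifold constraint, are assumptions for illustration, not the mHC paper's actual formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperConnectionBlock(nn.Module):
    # Toy block with n parallel residual streams ("hyper-connections").
    # Illustrative only: the softmax projection standing in for the
    # manifold constraint is an assumption, not the mHC method.
    def __init__(self, dim, n_streams=4):
        super().__init__()
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.read = nn.Parameter(torch.ones(n_streams) / n_streams)   # depth connections
        self.mix = nn.Parameter(torch.eye(n_streams))                 # width connections
        self.write = nn.Parameter(torch.ones(n_streams) / n_streams)

    def forward(self, h):  # h: (batch, n_streams, seq, dim)
        # Read a weighted combination of the streams as the layer input.
        x = torch.einsum('s,bsld->bld', F.softmax(self.read, dim=0), h)
        out = self.ffn(x)
        # Constrained mixing: projecting the stream-mixing matrix onto
        # row-stochastic matrices keeps the combined residual signal from
        # blowing up or vanishing as blocks are stacked.
        mix = F.softmax(self.mix, dim=-1)
        h = torch.einsum('st,btld->bsld', mix, h)
        # Write the layer output back into every stream.
        return h + self.write.view(1, -1, 1, 1) * out.unsqueeze(1)

h = torch.randn(2, 4, 16, 64)                # 4 residual streams
print(HyperConnectionBlock(64)(h).shape)     # torch.Size([2, 4, 16, 64])

The point of constraining the mixing matrix to a bounded set is that repeated mixing across depth cannot amplify or collapse the residual signal, which is the training-instability problem the post describes.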

2025-10-21
Compress everything visually! DeepSeek has just released DeepSeek-OCR, a state-of-the-art OCR model with 3B parameters. Core idea: explore long-context compression via 2D optical mapping. Architecture: - DeepEncoder → compresses high-res inputs into few vision tokens; - [image]
The Decoder

DeepSeek releases DeepSeek-OCR, a vision language model designed for efficient vision-text compression, enabling longer contexts with less compute
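The compression idea behind the post and headline, in rough terms: render a document page as an image and let a vision encoder squeeze the high-resolution input into a short sequence of vision tokens, which the language model then reads in place of many text tokens. Below is a toy sketch of such a compressor, assuming a simple convolutional downsampler; the module names, image size, and token count are illustrative and do not describe DeepSeek-OCR's actual DeepEncoder.

import torch
import torch.nn as nn

class ToyOpticalCompressor(nn.Module):
    # Toy "optical compression" encoder: a rendered page image goes in,
    # a short sequence of vision tokens comes out. Purely illustrative.
    def __init__(self, d_model=1024):
        super().__init__()
        # Aggressive spatial downsampling: 1024x1024 page -> 8x8 grid = 64 tokens.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=16, stride=16), nn.GELU(),   # 1024 -> 64
            nn.Conv2d(64, 256, kernel_size=4, stride=4), nn.GELU(),   # 64 -> 16
            nn.Conv2d(256, d_model, kernel_size=2, stride=2),         # 16 -> 8
        )

    def forward(self, page):                         # page: (batch, 3, 1024, 1024)
        feats = self.backbone(page)                  # (batch, d_model, 8, 8)
        return feats.flatten(2).transpose(1, 2)      # (batch, 64, d_model)

page = torch.randn(1, 3, 1024, 1024)
print(ToyOpticalCompressor()(page).shape)            # torch.Size([1, 64, 1024])

A page that might span thousands of text tokens is reduced here to 64 vision tokens fed to the decoder, which is the compression effect the release highlights.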

the new frontier of OCR from @deepseek_ai, exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G), powered by vllm==0.8....
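As a rough sketch of what running the model through vLLM's offline multimodal interface looks like: the model id and prompt template below are assumptions to be checked against the DeepSeek-OCR model card, and the exact vLLM version is left as quoted above.

from vllm import LLM, SamplingParams
from PIL import Image

# Assumed Hugging Face model id; verify the prompt format on the model card.
llm = LLM(model="deepseek-ai/DeepSeek-OCR", trust_remote_code=True)

image = Image.open("page.png")
prompt = "<image>\nConvert the document to markdown."   # assumed template

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(temperature=0.0, max_tokens=2048),
)
print(outputs[0].outputs[0].text)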