2026-01-02
DeepSeek drops new paper: mHC (Manifold-Constrained Hyper-Connections). According to a DeepSeek researcher, “the two biggest architectural innovations in 2025 are 1) Muon and 2) Hyper-Connections.” So what does mHC bring to the table? Residual connections have been the backbone
South China Morning Post
DeepSeek researchers detail mHC, a new architecture they used to train 3B, 9B, and 27B models, finding it scaled without adding significant computational burden
DeepSeek has published a technical paper co-authored by founder Liang Wenfeng proposing a rethink of its core deep learning architecture
2026-01-01
DeepSeek drops new paper: mHC (Manifold-Constrained Hyper-Connections). According to a DeepSeek researcher, “the two biggest architectural innovations in 2025 are 1) Muon and 2) Hyper-Connections.” So what does mHC bring to the table? Residual connections have been the backbone
South China Morning Post
DeepSeek researchers detail a new mHC architecture they used to train 3B, 9B, and 27B models, finding it scaled without adding significant computational burden
DeepSeek has published a technical paper co-authored by founder Liang Wenfeng proposing a rethink of its core deep learning architecture