VOICE ARCHIVE

Tim Dettmers

@tim_dettmers
9 posts
2026-01-28
We release SERA, the first model in Ai2's Open Coding Agent series. SERA is a SoTA agent for its size, super simple, and 26x more efficient than RL. In my blog post, I write about my personal journey of building this coding agent: https://timdettmers.com/... Details: 👇
SiliconANGLE

Ai2 launches Open Coding Agents, starting with SERA, an open-source family that includes 32B and 8B parameter models designed to adapt to private codebases

Artificial intelligence is moving swiftly, changing how developers craft, as code flows ever faster into repositories such as GitHub …

2025-12-11
Many people think AI will continue to improve towards AGI. In my new blog post, I argue that we will not reach AGI for physical reasons. Key items discussed: the physical reality of computation; why GPUs will no longer improve; why superintelligence is a fantasy.
Tim Dettmers

An Ai2 research scientist says AGI may never emerge because such a concept ignores the physical realities and limits of computation, such as energy constraints

If you are reading this, you probably have strong opinions about AGI, superintelligence, and the future of AI. X: @scaling01, @sriramk, @tim_dettmers. LinkedIn...

My new blog post discusses the physical reality of computation and why this means we will not see AGI or any meaningful superintelligence: https://timdettmers.com/...

2025-11-07
I guess we are now very close to open-weights vs closed-source parity. Can't test it since I am traveling and my laptop broke (😭), but many people say it's better than Sonnet/Gemini/Grok. Very exciting times!
CNBC

Chinese startup Moonshot releases Kimi K2 Thinking, an open-weight model it claims beats GPT-5 in agentic capabilities; source: the model cost $4.6M to train

Chinese startup Moonshot on Thursday released its latest generative artificial intelligence model, which it claims beats OpenAI's ChatGPT in …

2025-01-31
Beating DeepSeek-V3 with a 405B Llama base is not easy — solid post-training goes a long way. The nice thing is that it is fully open-source, so anyone can use this recipe for their base models.
TechCrunch

The Allen Institute for AI releases Tulu 3 405B, an open source model that it claims outperforms DeepSeek V3 and OpenAI's GPT-4o on certain benchmarks

Move over, DeepSeek. There's a new AI champion in town — and they're American. On Thursday, Ai2, a nonprofit AI research institute based …

2024-12-27
Reading the report, this is such clean engineering. The DeepSeek team directly engineered solutions to known problems under hardware constraints. All of this looks so elegant — no fancy “academic” solutions, just pure, solid engineering. Respect 👏
VentureBeat

DeepSeek releases DeepSeek-V3, an open-source MoE model of 671B total parameters, with 37B activated per token, claiming it outperforms top models like GPT-4o

Chinese AI startup DeepSeek, known for challenging leading AI vendors with its innovative open-source technologies, today released a new ultra-large model: DeepSeek-V3.

2024-09-25
Open-source models beating closed models will become more and more common. Scaling has diminishing returns. The best solution will come not from the largest scale but from the best approach or data. Especially with test-time compute, you do not need the best model to have the best solution.
Wired

The Allen Institute for AI debuts Multimodal Open Language Model in 1B- to 72B-parameter sizes, the most capable open-source AI model with visual abilities yet

A compact and fully open source visual AI model will make it easier for AI to take control of your computer—hopefully in a good way.

2024-09-07
Looks like we got project Strawberry/Orion/Q* a bit earlier than expected 😂 Actions speak louder than words. Who is gonna pay $2k/month now?
VentureBeat

HyperWrite CEO unveils Reflection 70B, based on Llama 3.1 70B Instruct and trained using reflection-tuning, and says it beats GPT-4o in all benchmarks tested

There's a new king in town: Matt Shumer, co-founder and CEO of AI writing startup HyperWrite, today unveiled Reflection 70B …