lmsysorg · TEXXR

🎉 Meet Qwen3.5-397B-A17B from @Alibaba_Qwen, 397B total params (17B active), built for real-world multimodal intelligence — day-0 support is now live in SGLang! 👁️ Unified vision-language foundation (early fusion): stronger reasoning, coding & agents ⚡ Gated DeltaNet + sparse [image]

2026-02-17 View on X

Reuters

Alibaba debuts Qwen3.5, a 397B-parameter open-weight multimodal AI model that it says is 60% cheaper to use and 8x better at large workloads than Qwen3

View original

🎉 Meet Qwen3.5-397B-A17B from @Alibaba_Qwen, 397B total params (17B active), built for real-world multimodal intelligence — day-0 support is now live in SGLang! 👁️ Unified vision-language foundation (early fusion): stronger reasoning, coding & agents ⚡ Gated DeltaNet + sparse [image]

2026-02-16 View on X

Reuters

Alibaba debuts Qwen3.5, a 397B-parameter open-weight multimodal AI model that it says is 60% cheaper to use and 8x better at large workloads than Qwen3

View original

🎉 The mysterious Pony Alpha is finally revealed, congrats to @Zai_org on releasing GLM-5! SGLang is ready to support on day-0. 🛠️ 744B params (40B active) model built for complex systems engineering & long-horizon agentic tasks 📚 28.5T tokens pretraining for a stronger [image]

2026-02-12 View on X

Z.ai

Z.ai launches GLM-5, saying its flagship open-weight model has “best-in-class performance among all open-source models” in reasoning, coding, and agentic tasks

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways …

View original

🎉 The mysterious Pony Alpha is finally revealed, congrats to @Zai_org on releasing GLM-5! SGLang is ready to support on day-0. 🛠️ 744B params (40B active) model built for complex systems engineering & long-horizon agentic tasks 📚 28.5T tokens pretraining for a stronger [image]

2026-02-12 View on X

Reuters

Z.ai says it will raise prices by at least 30% for new GLM coding plan subscribers to accommodate surging demand for its AI coding tools

View original

🎉 The mysterious Pony Alpha is finally revealed, congrats to @Zai_org on releasing GLM-5! SGLang is ready to support on day-0. 🛠️ 744B params (40B active) model built for complex systems engineering & long-horizon agentic tasks 📚 28.5T tokens pretraining for a stronger [image]

2026-02-11 View on X

Z.ai

Z.ai launches GLM-5, its flagship open-weight model, saying it has best-in-class performance among open-source models in reasoning, coding, and agentic tasks

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways …

View original

🚀 SGLang In-Depth Review of the NVIDIA DGX Spark is LIVE! Thanks to @NVIDIA 's early access program, SGLang makes its first ever appearance in a consumer product, the brand-new DGX Spark. The DGX Spark's 128GB Unified Memory and Blackwell architecture set a new standard for local AI prototyping and edge computing...

2025-10-14 View on X

PCMag

Nvidia says it will begin selling the DGX Spark mini PC, with DGX OS, for AI developers on October 15 on Nvidia.com and select third-party retailers for $3,999

(PCMag/Michael Kan) … It's not a consumer desktop, but Nvidia's foray into an AI developer-focused mini PC is finally ready to launch.

View original

Congratulations on the Amazing Doubao-1.5 models! Doubao models are fine-tuned on veRL framework and SGLang will collaborate with veRL team deeply in the next version for a much better hybrid engine! Stay tuned! [image]

2025-01-23 View on X

Reuters

ByteDance debuts Doubao-1.5-pro, claiming its latest flagship AI model outperforms OpenAI's o1 in AIME benchmarks, joining DeepSeek in China's AI reasoning push

TikTok owner ByteDance on Wednesday released an update to its flagship AI model aimed at challenging Microsoft-backed OpenAI's …

View original

The best open-source LLM, DeepSeek V3, has just been released! SGLang v0.4.1 is the officially recommended inference solution for it. The SGLang and DeepSeek teams worked together to support DeepSeek V3 FP8 on NVIDIA and AMD GPUs from day one. SGLang has supported MLA and DP

2024-12-27 View on X

VentureBeat

DeepSeek releases DeepSeek-V3, an open-source MoE model of 671B total parameters, with 37B activated per token, claiming it outperforms top models like GPT-4o

Chinese AI startup DeepSeek, known for challenging leading AI vendors with its innovative open-source technologies, today released a new ultra-large model: DeepSeek-V3.

View original

Woah, another exciting update from Chatbot Arena❤️‍🔥 The results for @xAI's sus-column-r (Grok 2 early version) are now public**! With over 12,000 community votes, sus-column-r has secured the #3 spot on the overall leaderboard, even matching GPT-4o! It excels in Coding (#2), [image]

2024-08-14 View on X

TechCrunch

xAI launches Grok-2 and Grok-2 mini in beta with improved reasoning, available to X Premium and Premium+ users, and plans to add them to its API later in August

Elon Musk-owned launched Grok-2 and Grok-2 mini in beta today with improved reasoning. The new Grok AI model can now generate images …

View original

Congrats @GoogleDeepMind on the Gemma-2-2B release! Gemma-2-2B has been tested in the Arena under “guava-chatbot”. With just 2B parameters, it achieves an impressive score 1130 on par with models 10x its size! (For reference: GPT-3.5-Turbo-0613: 1117, Mixtral-8x7b: 1114). This [image]

2024-08-01 View on X

The Decoder

Google unveils updates to its Gemma family of open models, including Gemma 2 2B, which it claims surpasses GPT-3.5 and Mixtral 8x7B on the LMSYS Chatbot Arena

Google has unveiled updates to its Gemma 2 family of open-source language models, focusing on improved performance, safety, and transparency.

View original

Congrats @GoogleDeepMind on the new Gemma-2 27B & 9B release! Gemma-2 was tested in the Arena under the codename “*late-june-chatbots” and now out of stealth. Its early result matches the best open models (Llama-3-70B, Nemotron-340B) with only 27B parameters! Impressively, [image]

2024-06-28 View on X

VentureBeat

Google says Gemma 2 will be available to researchers and developers through Vertex AI from July as a new 9B-parameter model and the prior 27B-parameter model

View original

Congrats @nvidia on the exciting 340B model release! The model was tested under the codename “june-chatbot” and is now coming out of stealth with impressive performance, surpassing Llama-3-70b across hard benchmarks like Arena-Hard-Auto. The new best open model? Come play with [image]

2024-06-16 View on X

NVIDIA Blog

Nvidia announces Nemotron-4 340B, a family of models that developers can use to generate synthetic data for training LLMs for commercial applications

Nemotron-4 340B, a family of models optimized for NVIDIA NeMo and NVIDIA TensorRT-LLM, includes cutting-edge instruct and reward models, and a dataset for generative AI training.

View original

Congrats @nvidia on the exciting 340B model release! The model was tested under the codename “june-chatbot” and is now coming out of stealth with impressive performance, surpassing Llama-3-70b across hard benchmarks like Arena-Hard-Auto. The new best open model? Come play with [image]

2024-06-15 View on X

NVIDIA Blog

Nvidia announces Nemotron-4 340B, a family of models that developers can use to generate synthetic data for training LLMs for commercial applications

Nemotron-4 340B, a family of models optimized for NVIDIA NeMo and NVIDIA TensorRT-LLM, includes cutting-edge instruct and reward models, and a dataset for generative AI training.

View original

@simonw hi @simonw, thanks a ton! We really value your feedback. Just to clarify, following our policy, we've partnered with several model developers to bring their new models to our platform for community preview testing. These models are strictly for testing and won't be listed on the...

2024-05-01 View on X

Gizmodo

A look at gpt2-chatbot, a mysterious AI chatbot that became available on LLM benchmarking site LMSYS Org and is believed to have similar capabilities as GPT-4

An advanced AI model with unknown origins turned heads this week as online communities unpacked a cryptic tweet from Sam Altman.

View original

Thanks for the incredible enthusiasm from our community! We really didn't see this coming. Just a couple of things to clear up: - In line with our policy, we've worked with several model developers in the past to offer community access to unreleased models/checkpoints (e.g.,...

2024-05-01 View on X

Gizmodo

A look at gpt2-chatbot, a mysterious AI chatbot that became available on LLM benchmarking site LMSYS Org and is believed to have similar capabilities as GPT-4

An advanced AI model with unknown origins turned heads this week as online communities unpacked a cryptic tweet from Sam Altman.

View original

Links & plots: - Vote @ https://chat.lmsys.org/ - Leaderboard https://huggingface.co/... - CI on model strength [image]

2024-03-28 View on X

Ars Technica

Anthropic's Claude 3 Opus surpassed OpenAI's GPT-4 on Chatbot Arena, a crowdsourced LLM leaderboard used by AI researchers; GPT-4 has been first since launch

Anthropic's Claude 3 is first to unseat GPT-4 since launch of Chatbot Arena in May '23. — On Tuesday, Anthropic's Claude 3 …

View original

[Arena Update] 70K+ new Arena votes🗳️ are in! Claude-3 Haiku has impressed all, even reaching GPT-4 level by our user preference! Its speed, capabilities & context length are unmatched now in the market🔥 Congrats @AnthropicAI on the incredible Claude-3 launch! More exciting... [image]

2024-03-28 View on X

Ars Technica

Anthropic's Claude 3 Opus surpassed OpenAI's GPT-4 on Chatbot Arena, a crowdsourced LLM leaderboard used by AI researchers; GPT-4 has been first since launch

Anthropic's Claude 3 is first to unseat GPT-4 since launch of Chatbot Arena in May '23. — On Tuesday, Anthropic's Claude 3 …

View original