Z.ai, formerly known as Zhipu, which has raised $1.5B from Tencent and others, releases GLM-4.5, an open-source AI model that it says is cheaper than DeepSeek
chinese models really are taking over huh Simon Willison / @simonwillison.net : Pretty decent pelicans from the new GLM-4.5 and GLM-4.5 Air models. Both models are MIT licensed, released by Chinese A...
Alibaba debuts the Qwen3-Coder model for agentic coding, including a 480B-parameter MoE variant, and open sources Qwen Code, a CLI tool adapted from Gemini CLI
Coco Feng / South China Morning Post : Alibaba upgrades flagship Qwen3 model to outperform OpenAI, DeepSeek in maths, c...
DeepSeek releases MIT-licensed DeepSeek-V3-0324, the latest version of their enormous DeepSeek v3 model; the previous DeepSeek v3 version had a custom license
deepseek-ai/DeepSeek-V3-0324. Chinese AI lab DeepSeek just released the latest version of their enormous DeepSeek v3 model … X: @awnihannun , @simonw , @simonw , @iterintellectus , @levie , and @data...
Mac Studio with M3 Ultra and 512GB of unified memory review: opens up new workflows on a ~$10,000 desktop, like running a quantized version of DeepSeek R1 671B
if confusing — performance magic Federico Viticci / MacStories : The M3 Ultra Mac Studio for Local LLMs Brandon Hill / Tom's Hardware : Apple Mac Studio review: M3 Ultra offers amazing performance, si...
Alibaba releases open-source reasoning model QwQ-32B on Hugging Face and ModelScope, claiming comparable performance to DeepSeek-R1 but with lower compute needs
QwQ is the reasoning model of the Qwen series. Paul Barker / InfoWorld : Alibaba says its new AI model rivals DeepSeek's R-1, OpenAI's o1 Jose Antonio Lanz / Decrypt : Alibaba's Latest A...
Alibaba's Qwen team releases Qwen2.5-VL, a new series of AI models that can control PCs and phones, as well as perform a number of text and image analysis tasks
Anusuya Lahiri / Benzinga : Not Just DeepSeek - Alibaba Unveils AI Model To Rival OpenAI's Operator Markus Kasanmascheff / WinBuzzer : Alibaba Qwen Cha...
Alibaba releases QvQ-72B-Preview, an experimental research model focused on “enhancing visual reasoning capabilities”, built on Qwen2-VL-72B
QVQ-72B-Preview is an experimental research model developed by the Qwen team … QwenLM on GitHub : Qwen2-VL — Introduction After a year's relentless efforts, today we are thrilled to release Qwen2-VL...
Mac mini (2024) review: incredibly fast with the M4 Pro, well-equipped for creative work, and smaller design takes less space, but spec upgrades are overpriced
but is it right for you? Shannon Grixti / Press Start : Apple Mac Mini (2024) Review - Smaller Design With Much More Power Brian Heater / TechCrunch : iMac (M4) review: a mini upgrade to Apple's entry...
Nvidia and Mistral release Mistral NeMo, a 12B-parameter language model with a 128K-token context window, available under the Apache 2.0 open-source license
Mistral NeMo: our new best small model. A state-of-the-art 12B model … Jonathan Kemper / The Decoder : Mistral releases three new LLMs for math, code and general tasks X: Prince Canuma / @prince_canu...
Apple open sources Pkl, a configuration-as-code language with rich validation and tooling, with Swift, Go, Java, and Kotlin integration
We are delighted to announce the open source first release of Pkl (pronounced Pickle), a programming language for producing configuration. Threads: @danielpunkass . Mastodon: @lkanies@hachyderm.io . X...