mascobot · TEXXR

This is a great read from Eric. I feel the same way in many ways, and it resonates a lot with the conversations with researcher friends I've had over the last few months. The world has changed a lot since 2022 (since the release of ChatGPT), and even more in the last year. If

2026-02-07 View on X

Evjang.com

A look at the state of AI agents, the evolution of thinking models, the staggering need for inference compute in the coming years, automated research, and more

— Dr. Vannevar Bush, As We May Think, 1945 — If we consider life to be a sort of open-ended MMO, the game server has just received a major update.

Tinker is now available to everyone. Basically the best ML infra for training large-scale SOTA models, all available through one API abstraction:

2025-12-13 View on X

Thinking Machines Lab

Mira Murati's Thinking Machines Lab makes Tinker, its API for fine-tuning language models, generally available, adds support for Kimi K2 Thinking, and more

Tinker is a dream for multi-agent setups, Nathan Lambert / @natolambert : Please add olmo3 @johnschulman2 et al. The goal is to make it the foundational research infrastructure for...

This is pretty cool. It basically covers “full stack” model training, with most of the concepts there, but in something you can train on an H100 node for ~$100: - Train a tokenizer (Rust implementation) - Pre-train a Transformer on FineWeb, evaluate CORE score across a number of

2025-10-14 View on X

@karpathy

Andrej Karpathy unveils nanochat, a full-stack training and inference implementation of an LLM in a single, dependency-minimal codebase, deployable in 4 hours

It provides a full ChatGPT-style LLM, including training, inference and a web UI … X: Clem / @clementdelangue : Am I wrong in sensing a paradigm shift in AI? Feels like we're movin...

Super thrilled to back @miramurati and the amazing team @thinkymachines - a GOAT team that has made major contributions to RL, pre-training/post-training, reasoning, multimodal, and of course ChatGPT! No one is better positioned to advance the frontier. @martin_casado @pmarca

2025-07-16 View on X

Reuters

Mira Murati's Thinking Machines Lab raised a $2B seed led by a16z at a $12B valuation; Nvidia, Accel, ServiceNow, Cisco, AMD, and Jane Street also invested

bsky.app/profile/wire... [embedded post] @akhilrao : i feel like i've known Murati is at a startup called Thinking Machine Labs for months. maybe idk what “stealth” means [embedde...

Wow, Qwen3 235B MoE with 22B active params, beats R1, Grok, O1 and O3 mini. - 2 MoE models and 6 dense models, ranging from 0.6B to 235B. - Apache 2.0 [image]

2025-04-29 View on X

TechCrunch

Alibaba debuts its Qwen3 family of open-weight “hybrid” AI reasoning models, including Qwen3-235B-A22B, with 235B total parameters and 22B activated parameters

Chinese tech company Alibaba on Monday released Qwen3, a family of AI models the company claims matches …

We @a16z couldn't be more excited to partner with Cursor @cursor_ai, @mntruell, @amanrsanger, @sualehasif996 & team. Seeing the progress and iteration of this team since the early days has been phenomenal. I have been using Cursor since the very early days and I use it daily now.

2024-08-23 View on X

Financial Times

Dealroom: AI coding assistant startups such as Anysphere and Augment have raised $433M so far in 2024 alone, bringing the total since January 2023 to $906M

Software engineering attracts investors but making money from generative artificial intelligence still eludes many

Amazing addition to the @AnthropicAI team. Congrats @johnschulman2 & Team!

2024-08-06 View on X

@johnschulman2

OpenAI co-founder John Schulman departs to join Anthropic and focus on AI alignment, and says “I'm not leaving due to lack of support for alignment research”

I shared the following note with my OpenAI colleagues today: I've made the difficult decision to leave OpenAI. This choice stems from my desire to deepen my focus on AI alignment, ...

Llama 3.1 benchmarks side by side. This is truly a SOTA model. It beats GPT-4 on almost every single benchmark. Continuously trained with a 128K context length. Pre-trained on 15.6T tokens (405B). The fine-tuning data includes publicly available instruction datasets, as well as [image]

2024-07-24 View on X

Mark Zuckerberg argues that “open source AI” is the path forward, closed models are vulnerable to vendor lock-in and state-backed espionage, and more

RE: https://www.threads.net/... Dare Obasanjo / @carnage4life : You can find @zuck's full post here https://www.facebook.com/... Dare Obasanjo / @carnage4life : Mark Zuckerberg has...

Llama 3.1 benchmarks side by side. This is truly a SOTA model. It beats GPT-4 on almost every single benchmark. Continuously trained with a 128K context length. Pre-trained on 15.6T tokens (405B). The fine-tuning data includes publicly available instruction datasets, as well as [image]

2024-07-24 View on X

Bloomberg

Meta debuts Llama 3.1 405B, the “first frontier-level open source AI model”, as well as new Llama 3.1 70B and 8B models, and says it's working on Llama 4

Open-source Llama 3 is moving fast: Llama 3 8B-Instruct with a 160K context window, done with progressive training on augmented generations of SlimPajama at increasing context lengths

2024-05-03 View on X

The Information

Some developers are releasing versions of Llama 3, which has a context window of 8K+ tokens, with longer context windows, thanks to Meta's open-source approach

LLaMA2 Live Demo: [video]

2023-07-19 View on X

Meta releases Llama 2, its open-source LLM with double the context length, for free for research and commercial use, and expands its Microsoft partnership

Recent breakthroughs in AI, and generative AI in particular, have captured the public's imagination and demonstrated what those developing …

✨NEW LAUNCH! LLaMA2 chat API & open-source playground💫: We're releasing tools that make it easy to test @meta's latest LLM & add it to your own app with @replicatehq. Playground: https://llama2.ai/ Live chat API here: https://replicate.com/... Repos & instructions below:

2023-07-19 View on X

Meta releases Llama 2, its open-source LLM with double the context length, for free for research and commercial use, and expands its Microsoft partnership

Recent breakthroughs in AI, and generative AI in particular, have captured the public's imagination and demonstrated what those developing …

A new European AI regulation proposal would make any “American opensource developer” that hosts an “unlicensed LLMs” on GitHub & available in Europe liable for “€20,000,000 or 4% of worldwide revenue” https://technomancers.ai/... [image]

2023-05-16 View on X

Stratechery

Google's I/O 2023 suggests that AI is a sustaining innovation for Big Tech; the true fight will be between the major players' centralized models and open source

A new European AI regulation proposal would make any “American opensource developer” that hosts an “unlicensed LLMs” on GitHub & available in Europe liable for “€20,000,000 or 4% of worldwide revenue” https://technomancers.ai/... [image]

2023-05-15 View on X

Stratechery

Google's AI-heavy I/O suggests AI is a sustaining innovation for Big Tech; the true fight will be between major players' centralized models and open source

Some things in tech are shocking, but not surprising — think of a CEO of a struggling company losing their job.

This is a leap in image/pixel segmentation. Meta AI just released SAM (Segment Anything Model). One of the most interesting things is its good understanding of objects ("objectification" of parts). The model is released open source under an Apache 2.0 license, and it's only 2.4 GB.... https://twitter.com/... https://twitter.com/...

2023-04-06 View on X

SiliconANGLE

Meta releases its Segment Anything Model and Segment Anything 1-Billion mask dataset, hoping to help researchers with computer vision and object identification

and Meta is sharing the code Katie Paul / Reuters : Meta releases AI model that can identify items within images GitHub : Segment Anything — Meta AI Research, FAIR — [Paper] [P...

GPT-4 benchmark chart with other SOTA models: MMLU (multiple-choice questions): GPT-4 -> 86.4%; GPT-3.5 -> 70.0%; PaLM -> 70.7%; Flan-PaLM -> 75.2% https://twitter.com/...

2023-03-15 View on X

OpenAI

OpenAI debuts GPT-4, claiming the model “surpasses ChatGPT in its advanced reasoning capabilities”, available in ChatGPT Plus and as an API that has a waitlist

Following the research path from GPT, GPT-2, and GPT-3, our deep learning approach leverages more data and more computation …
