Thoughts on AI progress and why AI labs' actions hint at a worldview in which AI models will continue to fare poorly at generalization and on-the-job learning
Why I'm moderately bearish in the short term, and explosively bullish in the long term — What are we scaling? X: @sriramk , @_simonsmith , @dwarkesh_sp , @emollick , @dwarkesh_sp , @dwarkesh_sp , @m...
GPT-5 hands-on: it exudes competence but doesn't feel like a dramatic leap ahead of other LLMs, and the pricing is aggressively competitive with other providers
And It Changes Everything Tyler Cowen / Marginal Revolution : GPT-5, a short and enthusiastic review GPT-5 : GPT-5 — Our hands-on review of OpenAI's newest model based on weeks of testing — The Ve...
Google unveils benchmarking platform Kaggle Game Arena, where LLMs compete head-to-head in strategic games, starting with a chess tournament from August 5 to 7
Watch models compete in complex games providing a verifiable and dynamic measure of their capabilities. Kaggle : Chess Text Input Leaderboard Nick Bild / Hackster : Shall We Play a Game? Maximilian Sc...
OpenAI unveils o3 and o3-mini, trained to “think” before responding via what OpenAI calls a “private chain of thought”, and plans to launch them in early 2025
12 Days of OpenAI: Day 12 Naomi Li Gan / Tech in Asia : OpenAI unveils AI model for advanced reasoning Bojan Stojkovski / Interesting Engineering : OpenAI unveils o3 reasoning AI model to tackle compl...
OpenAI launches GPT-4-powered ChatGPT Enterprise, with improved privacy, performance, and data analysis features; pricing is “dependent on each company's usage”
but is it playing catch-up? X: Andrej Karpathy / @karpathy : Imo the productivity amplification here is so large that organizations should be thinking about it as a basic work tool, like a new kind of...