Cursor recently experimented with using hundreds of AI agents to build a web browser; they ran for close to a week, writing 1M+ lines of code across 1,000 files
Scaling long-running autonomous coding. Wilson Lin at Cursor has been doing some experiments to see how far you can push a large fleet of “autonomous” coding agents:
Chinese startup Moonshot releases Kimi K2 Thinking, an open-weight model it claims beats GPT-5 in agentic capabilities; source: the model cost $4.6M to train
Chinese startup Moonshot on Thursday released its latest generative artificial intelligence model, which it claims beats OpenAI's ChatGPT in …
Court docs: in a deposition, Ilya Sutskever discussed conflicts at OpenAI, the memo he sent to board members before Sam Altman's firing, his OpenAI exit, and more
Anthropic initially expressed “excitement” about a possible merger with OpenAI two years ago, after OpenAI's board fired CEO Sam Altman …
OpenAI says it paused Sora's ability to generate videos resembling MLK Jr. at the request of his estate, after some users created “disrespectful depictions”
Mary Cunningham / CBS News: OpenAI blocks Sora 2 users from using MLK Jr.'s likeness after “disrespectful depictions”
Inside the discussions between OpenAI and talent agencies about the video app Sora; some agents say studios have been too reluctant to challenge tech giants
OpenAI's CEO brazenly regurgitated major studios' characters to allow video app Sora 2 to spit out clips tailor-made for users.
OpenAI announced Thursday it paused the ability for users to generate videos resembling the late civil rights activist …
xAI introduces Grok 4, trained on its Colossus supercomputer, with multimodal features, faster reasoning, Grok 4 Voice, Grok 4 Code, a new interface, and more
Deeper thinking and greater reasoning is promised — An hour after the live stream was supposed to start last night (July 9) …
Artificial Analysis benchmarks: Grok 4 is now the leading AI model, a first for xAI; Grok 4's per-token pricing is more expensive than Gemini 2.5 Pro's and o3's
xAI gave us early access to Grok 4 - and the results are in. Grok 4 is now the leading AI model. We have run our full suite of benchmarks and Grok 4 achieves an Artificial Analysis...
Elon Musk addresses Grok's antisemitic replies, saying that “Grok was too compliant to user prompts” and “too eager to please and be manipulated, essentially”
Herb Scribner / Axios
Apple researchers detail the limitations of top LLMs and large reasoning models, including on classic problems like the Tower of Hanoi, which AI solved in 1957
LLM “reasoning” is so cooked they turned my name into a verb — Quoth Josh Wolfe, well-respected venture capitalist at Lux Capital:
Meta VP of Generative AI Ahmad Al-Dahle denies a rumor that the company trained Llama 4 Maverick and Scout on test sets, saying that Meta “would never do that”
Pascale Davies / Euronews: From a political shift to a more powerful AI: Everything to know about Meta's Llama 4 models
LMArena says it is updating its leaderboard policies after a Llama 4 Maverick version, which Meta said in fine print is not public, secured the number two spot
With Llama 4, Meta fudged benchmarks to appear as though its new AI model is better than the competition.
Memo: Shopify CEO Tobi Lütke says using AI is now a “fundamental expectation” and that teams asking for more resources must first show why AI can't do the job
Shopify CEO Tobi Lütke is changing his company's approach to hiring in the age of artificial intelligence.
Meta launches Llama 4 Maverick with 400B parameters and Scout with 109B parameters and a 10M context window, and previews Behemoth with 2T total parameters
Takeaways — We're sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.
The hype around AI agent Manus doesn't represent a second DeepSeek moment, but reveals that Chinese startups can compete with US companies building AI products
The viral AI agent from a Chinese startup isn't about research breakthroughs, it's about creating competitive consumer products.
OpenAI launches o3-mini, its latest reasoning model that the company says is largely on par with o1 and o1-mini in capabilities, but runs faster and costs less
OpenAI on Friday launched a new AI “reasoning” model, o3-mini, the newest in the company's o family of reasoning models.
Rather than weakening China's AI capabilities, US sanctions appear to be driving startups like DeepSeek to innovate by prioritizing efficiency and collaboration
The AI community is abuzz over DeepSeek R1, a new open-source reasoning model. — The model was developed by the Chinese AI startup DeepSeek …
Yann LeCun says DeepSeek “profited from open research and open source” like Meta's Llama and is proof that open source models are surpassing proprietary ones
“Marc Andreessen, a co-inventor of the pioneering Mosaic web browser, co-founder of the Netscape browser company and current general partner at the famed Andreessen Horowitz (a16z)...
If you hadn't heard, there's a new AI star in town: DeepSeek, the subsidiary of Hong Kong-based quantitative analysis …