hey_zio · TEXXR

2025-08-08

Crazy that GPT-5 is only 0.4 percentage points better than Opus 4.1 on SWE-bench. Feels like Anthropic will pass them again with their bigger updates in a few weeks. The next few days of real-world usage will show if it's actually better than the current Claude models.

2025-08-08

VentureBeat

OpenAI touts GPT-5's scores on math, coding, and health benchmarks: 94.6% on AIME 2025 without tools, 74.9% on SWE-bench Verified, and 46.2% on HealthBench Hard

After literally years of hype and speculation, OpenAI has officially launched a new lineup of large language models (LLMs) …


2025-08-08

TechCrunch

OpenAI says GPT-5 is a unified system with an efficient model for most questions, a reasoning model for harder problems, and a router that decides which to use

All You Need To Know

Lakshay Kumar / Business Today: What is GPT-5? How OpenAI is upgrading your ChatGPT experience

Tsveta Ermenkova / PhoneArena: You can now chat with a PhD-lev...
