Claude Opus 4.5

@metr_evals

METR: Claude Opus 4.5 has a 50% task completion time horizon of about 4 hours and 49 minutes, more than double that of Claude Opus 4 released earlier this year

... just careful, meticulous rigor. Nikola Jurkovic / @nikolaj2030: This result updates me towards 4 month doubling times being my median estimate for the next two years. That means by EOY 2026 the time ...

2025-12-22

@metr_evals

METR: Claude Opus 4.5 has a 50% task completion time horizon of about 4 hours and 49 minutes, more than double that of Claude Opus 4 released earlier this year

We estimate that, on our tasks, Claude Opus 4.5 has a 50%-time horizon of around 4 hrs 49 mins (95% confidence interval of 1 hr 49 mins to 20 hrs 25 mins). While we're still working through evaluation...

2025-12-21

The Keyword

Google says Gemini 3 Pro sets new vision AI benchmark records, including in complex visual reasoning, beating Claude Opus 4.5 and GPT-5.1 in some categories

... Raising Concerns for Real-World Use. Will McCurdy / PCMag: ChatGPT Overtakes Amazon, X, Reddit, WhatsApp, and Wikipedia in Visitors. X: Demis Hassabis / @demishassabis: Gemini has always had exception...

2025-12-08

The Information

Sources: OpenAI is developing a new LLM, codenamed Garlic, that outperforms Gemini 3 and Claude Opus 4.5 in coding and reasoning tasks, per internal evaluations

OpenAI, which in recent weeks has appeared to fall behind Google in AI development, is fighting back with a new large language model codenamed Garlic. X: Amir Efrati / @amir: new: OpenAI dev...

2025-12-02

Anthropic

Study: on SCONE-bench, a benchmark of 405 blockchain smart contracts, Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 together developed exploits worth $4.6M

AI models are increasingly good at cyber tasks, as we've written about before. But what is the economic impact of these capabilities?

2025-12-02

Simon Willison's Weblog

Anthropic prices Claude Opus 4.5 at $5/1M input and $25/1M output tokens, much cheaper than Opus 4.1 at $15/$75 but still pricier than GPT-5.1 and Gemini 3 Pro

Opus 4.5 was responsible for most of the work across 20 commits, 39 files changed, 2,022 additions and 1,173 deletions in a two-day period. … Forums: r/BetterOffline: Claude Opus 4.5, and why evaluat...

2025-11-25

Anthropic

Anthropic launches Claude Opus 4.5, saying it is “the best model in the world for coding, agents, and computer use” and “meaningfully better at everyday tasks”

Our newest model, Claude Opus 4.5, is available today. It's intelligent, efficient …

2025-11-25
