Researchers say GPT-4.1, Claude 3.7 Sonnet, Gemini 2.5 Pro, and Grok 3 can reproduce long excerpts from books they were trained on when strategically prompted
On Tuesday, researchers at Stanford and Yale revealed something that AI companies would prefer to keep hidden.
A comparison of GPT-4o, Claude 3.7 Sonnet, Gemini 2.0 Flash, Llama 4, and Copilot: Claude won overall, having the most consistent answers and no hallucinations
We challenged AI helpers to decode legal contracts, simplify medical research, speed-read a novel and make sense of Trump speeches.
Anthropic adds web search to its API, starting at $10 per 1,000 searches, giving Claude 3.7 Sonnet, 3.5 Sonnet, and 3.5 Haiku access to up-to-date information
Anthropic is launching a new API that allows its Claude AI models to search across the web. Developers using it can build Claude-powered apps …
The Arc Prize Foundation says its new ARC-AGI-2 test stumps most AI models; humans get 60% of the questions right but GPT-4.5 and Claude 3.7 Sonnet score ~1%
François Chollet / @fchollet: Unlike ARC-AGI-1, this new version is not easily brute-forced. Current top AI approaches score 0-4%. All base LLMs (GPT-4.5, Claude 3.7 Sonnet, Gemini 2, etc.)...
Anthropic adds web search to Claude 3.7 Sonnet, available now in preview for paid US Claude users, with support for free users and more countries coming soon
Anthropic's AI-powered chatbot, Claude, can now search the web — a capability that had long eluded it.
Anthropic releases Claude 3.7 Sonnet, a hybrid model that can produce fast responses or extended, step-by-step thinking, and Claude Code, an agentic coding tool
and it could be a game changer
Ghacks: Anthropic Unveils Claude 3.7: First Hybrid Reasoning AI Model
Rowan Cheung / The Rundown AI: Claude enters the reasoning era
Siddharth Jindal / Analytics India...