2025-05-01
TechCrunch
7 related
A study from Cohere, Stanford, MIT, and Ai2 accuses LMArena of helping Meta, OpenAI, Google, and Amazon game its popular crowdsourced AI benchmark Chatbot Arena
A new paper from AI lab Cohere, Stanford, MIT, and Ai2 accuses LM Arena, the organization behind the popular crowdsourced AI …
2025-04-18
Bloomberg
6 related
LMArena says it's starting a company, whose corporate name will be Arena Intelligence, with plans to raise money, and releases a new beta version of its website
fixing errors/bugs, improving our UI layout, and more. To keep supporting the development and continual improvement of this platform, we're also forming a company. Future improvements will continue ...
2024-09-08
TechCrunch
A look at LMSYS' Chatbot Arena and the issues surrounding the crowdsourced LLM benchmark platform, including biases, lack of transparency, and commercial ties
Kyle Wiggers / TechCrunch. X: Woojin Kim / @woojinrad: The AI industry is obsessed with Chatbot Arena, but it might not be the best benchmark | @TechCrunch Human raters bring their bi...
2024-03-28
Ars Technica
21 related
Anthropic's Claude 3 Opus surpassed OpenAI's GPT-4 on Chatbot Arena, a crowdsourced LLM leaderboard used by AI researchers; GPT-4 had held first place since Chatbot Arena launched in May 2023
Anthropic's Claude 3 is first to unseat GPT-4 since launch of Chatbot Arena in May '23. — On Tuesday, Anthropic's Claude 3 …