2025-05-01
This would be the Scandal of the Century if it happened in the chess world! Imagine a player secretly playing multiple matches against an opponent but reporting only the score from the best match to maximize Elo rating gain. Paper: arxiv.org/abs/2504.20879 (via @randomwalker.bsky.social, @garymarcus.bsky.social)
TechCrunch
A study from Cohere, Stanford, MIT, and Ai2 accuses LMArena of helping Meta, OpenAI, Google, and Amazon game its popular crowdsourced AI benchmark Chatbot Arena
A new paper from AI lab Cohere, Stanford, MIT, and Ai2 accuses LM Arena, the organization behind the popular crowdsourced AI …