2025-04-08
The linked post is not true. There are indeed issues with Llama 4, from both the partner side (inference partners barely had time to prep. We sent out a few transformers wheels/vllm wheels mere days before release) and the model side. But there was no such training on test set.
TechCrunch
Meta VP of Generative AI Ahmad Al-Dahle denies a rumor that the company trained Llama 4 Maverick and Scout on test sets, saying that Meta “would never do that”
The Verge
LMArena says it is updating its leaderboard policies after a Llama 4 Maverick version, which Meta said in fine print is not public, secured the number two spot
With Llama 4, Meta fudged benchmarks to appear as though its new AI model is better than the competition.