ai evaluations (Entity)

Coverage Timeline

2024-12-26

Time 1 related

A look at the more challenging AI evaluations emerging in response to the rapid progress of models, including FrontierMath, Humanity's Last Exam, and RE-Bench

more interesting than it sounds! LinkedIn: Ross Dawson: The frontier of “evals”. Evaluations comparing AI and human capabilities are evolving rapidly as AI rapidly leaves existing benchmarks in the ...

2024-12-25

Time

A look at the more challenging AI evaluations emerging in response to the rapid progress of models, including FrontierMath, Humanity's Last Exam, and RE-Bench

Despite their expertise, AI developers don't always know what their most advanced systems are capable of—at least, not at first. X: @tharin_p: My latest piece for @TIME con...

