The Center for AI Safety (CAIS) and Scale AI are inviting people to submit questions to help build “the world’s most difficult artificial intelligence test”.
The quiz’s developers said in a statement that “current tests have grown too easy, and we can no longer track AI developments well, or how far they are from becoming expert-level.”
A few years ago, AI answered exam questions almost at random, but that is no longer the case.
According to Dan Hendrycks, executive director of CAIS, OpenAI’s most recent model, OpenAI o1, “destroyed the most popular reasoning benchmarks” last week.
However, AI is still unable to provide intelligent answers to other challenging scientific questions.
Additionally, according to Stanford University’s AI Index Report from April, AI appears to do poorly on tasks that involve planning and visual pattern recognition.
The organizers advise that question authors be PhD students or above, or have five or more years of experience in a technical industry role, such as at SpaceX.
Trick questions should be avoided, and submissions should be “not easily answerable via a quick online search” and challenging for non-experts to answer.
“As a rule of thumb, if a randomly selected undergraduate can understand what is being asked, it is likely too easy for the frontier LLMs of today and tomorrow,” the creators of the quiz stated.