Ironically, "I have a magic AI test but nobody is allowed to use it" is a lot closer to the Uri Geller situation. Tests are meant to be taken; that should be clear. And...maybe this does not apply in the academic domain, but to some extent, if you cheat on an AI test, "you're only cheating yourself."
And end users and developers and the general public too...
But here is the thing: even if it's rote memorization, why couldn't GPT-4o perform just as well on ARC-AGI-1? Or did the "reasoning" help in some way?