That's in no way different than claiming that LLMs understand language, or reason, etc, because they were designed that way.
Neural nets of all sorts have been beating benchmarks since forever, e.g. there's a ton of language understanding benchmarks pretty much all saturated by now (GLUE, SUPERGLUE ULTRASUPERAWESOMEGLUE ... OK I made that last one up) but passing them means nothing about the ability of neural net-based systems to understand language, regardless of how much their authors designed them to test language understanding.
Failing a benchmark also doesn't mean anything. A few years ago, at the first Kaggle competition, the entries were ad-hoc and amateurish. The first time a well-resourced team tried ARC (OpenAI) they ran roughshod over it and now you have to make a new one.
At some point you have to face the music: ARC is just another benchmark, destined to be beat in good time whenever anyone makes a concentrated effort at it and still prove nothing about intelligence, natural or artificial.