Yes, AI makes mistakes, but so do humans, very often.
I studied electronic engineering and then switched to software engineering as a career, and I can say the only time I've been exposed to math puzzles was in academic settings. The knowledge is nice and helps with certain kinds of problem solving, but you can be pretty sure I will reach for a textbook and a calculator before trying to brute-force such a puzzle.
The most important thing in my daily work is to understand the given task, do it correctly, and report what I've done.
Puzzle solving is only for when information is not available (reverse engineering, closed systems, ...), but there's a lot of information out there for the majority of tasks. I'm amazed when people spend hours trying to vibe code something when they could spend just a few minutes reading about the system and come up with a working solution (or find something that already works).
> AI can solve math puzzles better than 99.9% of population
So can a calculator.
While I do see this argument made quite frequently, doesn't any professional effort center on procedures employed specifically to avoid mistakes? Isn't this really the point of professional work (including professional liability)?