The discussion started with "what would it take to convince people that [insert favourite LLM] is good at maths?", and my response, IMHO, is that we already have much better tools for doing arithmetic (I don't even want to say maths), even if humans themselves are also poor at arithmetic.
What's the point of building a system that is as bad as humans at something we already know humans are bad at? LLMs have their uses, but (at least at the current stage) performing arithmetic calculations is not one of them, to say nothing of more advanced mathematics.
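To make the "better tools" point concrete, here's an illustrative sketch (my own example, not from the discussion) of what conventional tooling gives you for free: exact, deterministic arithmetic, using nothing beyond Python's built-in integers and the standard `decimal` module.

```python
from decimal import Decimal, getcontext

# Arbitrary-precision integers: Python ints never overflow or silently round,
# so large multiplications are exact every time.
big_product = 123456789 * 987654321
print(big_product)  # 121932631112635269, deterministically

# Exact decimal arithmetic avoids binary floating-point surprises
# (0.1 + 0.2 != 0.3 with ordinary floats).
getcontext().prec = 50
exact_sum = Decimal("0.1") + Decimal("0.2")
print(exact_sum)  # 0.3, exactly
```

The same computation produces the same answer every run, which is exactly the guarantee an LLM's token-by-token sampling cannot make.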