I'm not going to pass any judgement on what is reasonable or unreasonable for LLMs; I'm just pointing out that they make serious mistakes.
Many people here argue that LLMs are able to reason; you can check my post history from 3 months ago to see an example of this.
And if you have to double check everything, how much time are you really saving? And is this thing really on track to replace programmers any time soon? Absolutely not.