The argument is not "here's one failure case, therefore they don't reason." The argument is that if you systematically give an LLM problem instances outside its training set, in domains with clear structural rules, it will fail to solve them. It follows that the model has no real understanding of those rules, since it only seems capable of solving problems it has already seen. In other words, it has failed to learn how to solve novel instances of a general problem structure through logical reasoning.
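To make "novel instances of a fixed rule structure" concrete, here is a minimal sketch of the kind of probe people have in mind. The specific task (a cup-swap tracking puzzle) is my illustrative choice, not anything from the original argument: the rules are fully specified, ground truth is computable, and fresh instances are easy to sample, so any particular instance is unlikely to appear verbatim in a training corpus. The actual model call is left abstract.

```python
import random

# Hypothetical generator for a rule-based task: track a ball hidden under one of
# several cups while a random sequence of swaps is applied. Ground truth is
# computed by simulation, so a model's answers can be scored over many freshly
# sampled instances it has almost certainly never seen verbatim.

def make_instance(n_cups=4, n_swaps=8, rng=random):
    start = rng.randrange(n_cups)
    swaps = [tuple(rng.sample(range(n_cups), 2)) for _ in range(n_swaps)]

    # Simulate the swaps to get the true final position.
    pos = start
    for a, b in swaps:
        if pos == a:
            pos = b
        elif pos == b:
            pos = a

    steps = "; ".join(f"swap cups {a + 1} and {b + 1}" for a, b in swaps)
    prompt = (
        f"A ball starts under cup {start + 1} of {n_cups}. "
        f"Then: {steps}. Which cup is the ball under now? Answer with the cup number."
    )
    return prompt, pos + 1

if __name__ == "__main__":
    prompt, answer = make_instance()
    print(prompt)
    print("ground truth:", answer)
    # Scoring is left abstract: send `prompt` to the model under test and compare
    # its final number against `answer` across a large sample of instances.
```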
This strict dependence on having seen the same or very similar concrete instances suggests that these models don't actually generalize; they compute probabilities over known instances, which everyone knew already. The problem is that a lot of people claim the models are capable of more than this, because they want to make a quick buck in an insane market.