I should've said that the model is "missing", not "weak", when talking about LLMs; that was my mistake. Yes, I'm a human with an imperfect and in many respects incorrect conceptual model of the world; that is true. The following aren't real examples; they're hyperbolic, to better illustrate the category of errors I'm talking about.
If someone asks me "can I stare into the sun without eye protection?", my answer isn't going to change based on how the question is phrased. I understand that the radiation coming from the sun (and, more broadly, intense visible radiation emitted from any source) causes irreversible damage to your eyes, and that fact is part of my conceptual understanding of the world.
However, LLMs will flip-flop based on the tone and phrasing of your question. Asked normally, they will warn you about the dangers of staring into the sun, but if your question hints at disbelief, they might reply "No, you're right, staring into the sun isn't that bad".
I also know that mirrors reflect light, which allows me to intuitively understand that staring at the sun through a mirror is dangerous without being explicitly taught that fact.
If you ask an LLM whether it's safe to stare into a mirror pointed at the sun (oriented such that you see the sun in the mirror), it might agree that it's safe to do so, even though it "knows" that staring into the sun is dangerous, and it "knows" that mirrors reflect light. Presumably this is because its training data doesn't explicitly state that staring at the sun in a mirror is dangerous.
The way the question is framed can completely change their answer, which betrays their lack of conceptual understanding. Those are distinctly different problems. You might say that humans do this too, but we don't call that intelligent behavior, and we tend to have a low opinion of those who exhibit it often.