Because the phrase 'language model' (or rather 'large language model', LLM) is not a post-hoc classification arrived at by some digital anthropologist examining a black box. It's a description of the tool that OpenAI set out (successfully!) to build. Your ascribing additional properties to it is exactly the kind of thing I'm talking about: the output is so convincing that it's tempting to believe the model is reasoning beyond its capabilities, but it isn't. Can you cite specific examples of things it's doing besides producing text? It's generally terrible at maths (as you would expect).
Without wishing to diminish the importance of this work (because it is genuinely incredible and useful in all kinds of ways), we still need to remember that under the hood it's really an elaborate parlour trick: a sort of reverse Mechanical Turk, a machine pretending to be a brain. More interesting, I think, is the question of how much of human intelligence is likewise this kind of statistical pattern matching; it increasingly seems to me that we're not as smart as we think we are.