Based on what you've said, it sounds like your position is that unless we can specify the exact mechanism by which LLMs understand, we have no business saying that they understand. In a lot of domains that's a reasonable standard: if someone claims X, you ask for a mechanism of action, and they can't produce one, you have solid grounds for thinking they're bullshitting.
But this case isn't quite the same. We know that LLMs learn to represent their inputs in a high-dimensional vector space (embeddings), and that they learn the relationships between those vectors. We also see them effectively solve problems across a variety of domains using this representation. I think these two ingredients -- having a semantic representation and being able to use it to solve problems effectively -- amount to something like "understanding." The absence of both properties is why I'd say Markov chains and autocomplete tools don't "understand": they haven't learned an effective representation of the underlying phenomena. (I'd also argue our own situation as humans is similar. We don't have a good understanding of the human brain or of the precise mechanisms underlying thought. All we know is that we have semantic representations and can effectively solve problems.)
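To make the "semantic representation" point concrete, here's a toy sketch of the idea: words mapped to vectors, with relatedness measured by cosine similarity. The three-dimensional vectors below are invented purely for illustration -- real embeddings have hundreds or thousands of dimensions and are learned from data, not written by hand:

```python
import math

# Hand-picked toy "embeddings" -- invented for illustration only.
embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.2, 0.1],
    "apple": [0.1, 0.3, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 = more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Related words sit closer together in the space than unrelated ones.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low
```

The claim isn't that this toy captures what an LLM does, just that "representing meaning as geometry" is a real, inspectable mechanism rather than hand-waving.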
Small note on your chess point: it now looks like GPT-3.5 can achieve draws against Stockfish 8: https://marginalrevolution.com/marginalrevolution/2023/06/th...
Bigger note on your chess point: this example illustrates that claims about LLM abilities are, loosely speaking, "semi-decidable." We thought LLMs were bad at chess, but it turns out we just hadn't discovered the right way to prompt them. More generally, we can confirm that an LLM is good at X by finding a prompt that produces good performance at X, but given the size of the input space we're dealing with, we can't conclude that an LLM is bad at X just because we haven't seen it do well -- maybe we haven't discovered the right prompt yet. (These input spaces are massive, by the way. ChatGPT-3.5, for example, has a context window of 4,096 tokens, so even if we restricted ourselves to the 26 letters of the English alphabet, we'd be looking at more than 26^4096 possible inputs.)
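To put a number on that, here's a quick back-of-the-envelope calculation. The 26-symbol alphabet and 4,096-position context are the simplifying assumptions from above (real tokenizers have vocabularies in the tens of thousands, so this badly undercounts):

```python
# Size of the prompt space: 26 possible symbols in each of 4,096 positions.
alphabet_size = 26
context_length = 4096

num_inputs = alphabet_size ** context_length  # exact big-integer arithmetic

# 26**4096 is a number with 5,796 decimal digits -- for comparison,
# the number of atoms in the observable universe has about 80 digits.
print(len(str(num_inputs)))  # 5796
```

No amount of sampling from a space that size can establish a negative result; all we ever observe is a vanishing sliver of possible prompts.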