I didn't mean to suggest that it's just a large mapping between exact inputs it's seen before and exact outputs; it's definitely more complex than that! The size of the model lets it estimate which words are likely to go together, even in sequences it has never seen before.
In a sense, it's able to take an "educated" guess at what is statistically likely to be the response you're looking for, given the words in a particular input plus the earlier context of your interaction. To do that, it uses what it learned in training about the words, their sequences, their relationships to other words, etc.
But at the end of the day, none of that means it has any "understanding" of what it's outputting. That's why there have been countless examples of it outputting very well-constructed, real-sounding descriptions of books, papers, etc. that never existed: it's really good at generating sentences that have the right "shape", but it has no way of knowing whether the contents of the sentence are actually true. It just knows that, given what it's seen in its training set (again, through a complex web of relationships), the response it generated is likely to look like something someone would have written if they were given the same input.
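To make the "right shape, no idea if it's true" point concrete, here's a toy sketch. It is my own illustration, not how a real model is built: a bigram counter is a vastly simplified stand-in for the "complex web of relationships", and the corpus and the names `train_bigrams` and `generate` are made up for the example.

```python
import random
from collections import Counter, defaultdict

# Toy training set (hypothetical): every sentence in it is true.
corpus = [
    "paris is the capital of france",
    "rome is the capital of italy",
]

def train_bigrams(sentences):
    """Count, for each word, which words followed it in training."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split() + ["<end>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def generate(counts, start, rng, max_words=10):
    """Repeatedly sample a statistically likely next word.

    Nothing in here checks whether the finished sentence is true.
    """
    words = [start]
    while len(words) < max_words:
        followers = counts[words[-1]]
        if not followers:  # word never seen in training: nothing to sample
            break
        choices, weights = zip(*followers.items())
        nxt = rng.choices(choices, weights=weights)[0]
        if nxt == "<end>":
            break
        words.append(nxt)
    return " ".join(words)

model = train_bigrams(corpus)
rng = random.Random(0)  # fixed seed so runs are repeatable
samples = {generate(model, "paris", rng) for _ in range(50)}

for sentence in sorted(samples):
    label = "seen in training" if sentence in corpus else "never seen"
    print(f"{sentence}  ({label})")
```

After "of", the counts say "france" and "italy" are equally likely, so some of the samples come out as "paris is the capital of italy". That sentence was never in the training data, it's perfectly well-formed, and it's false. The toy model has no way to tell it apart from the true one, which is the same failure as the made-up books and papers, just at a much smaller scale.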