Take the sentence 'I fooed a bar with a Baz' - can you infer what I did from this?
How do you define meaning? If we can find a mapping between the sequence of words in a language and the underlying structure of the world, then we by definition know what those words mean. The question then reduces to whether there will be multiple such plausible mappings once we have completely captured the regularity of a "very large" sequence of natural language text. I strongly suspect the answer is no: there will be only one mapping, or a class of roughly equivalent mappings, so we can be confident in the discovered associations between words and concepts.
The number of relationships (think graph edges) between things-in-the-world is "very large". The number of possible sets of relationships between entities grows exponentially with the number of entities. But the structure in a natural language isn't arbitrary; it maps to these real world relationships in natural ways. So once we capture all the statistical regularities, there should be some "innocent" mapping between these regularities and things-in-the-world. "Innocent" here means that a relatively insignificant amount of computation went into finding the mapping (relative to the sample space of the input/output).
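A back-of-the-envelope sketch of that growth. It assumes a single kind of directed relationship and no self-links, which is a simplification for illustration and not something the argument above specifies; the function names are made up for the example.

```python
def ordered_pairs(n: int) -> int:
    """Possible directed edges between n distinct entities (no self-links)."""
    return n * (n - 1)


def possible_edge_sets(n: int) -> int:
    """Distinct ways to choose which of those edges are present.

    Each possible edge is either present or absent, so the count is
    2 raised to the number of possible edges.
    """
    return 2 ** ordered_pairs(n)


if __name__ == "__main__":
    # Toy sizes only; real-world entity counts are vastly larger.
    for n in (2, 3, 5, 10):
        print(n, ordered_pairs(n), possible_edge_sets(n))
```

Even at ten entities the number of candidate structures is already astronomically large, which is why it matters that language structure is not arbitrary: the regularities have to rule out almost all of these candidates.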
>Take the sentence 'I fooed a bar with a Baz' - can you infer what I did from this?
Write me a billion pages of text while using foo bar baz and all other words consistently throughout, and I could probably tell you.
Here's a good example. Suppose I had a huge set of recipe books from a human culture - just recipes, no other record of a culture.
I might be able to get as far as XYZZY meaning "a food that can be sliced, mashed, fried and diced", but how would I really tell whether XYZZY means carrot, potato, tomato, or tuna?
This seems a bit tautological to me: if being able to make certain mappings is understanding, then does this not amount to "once we understand something, understanding it is a solved problem?"
On the other hand, the apparently simplistic mappings used by these language models have achieved way more than I would have thought, so I am somewhat primed to accept that understanding turns out to be no more mysterious than qualia.
I doubt that just any mapping will do. One aspect of human understanding that still seems to be difficult for these models is reasoning about causality and motives.
I think it is a fairly common intuition that one cannot understand something just by rote-learning a bunch of facts.
I meant it in the sense of: given any reasonable definition of "understanding", finding a mapping between the sequences of words and the structure of the world must satisfy the definition.
>I think it is a fairly common intuition that one cannot understand something just by rote-learning a bunch of facts.
I agree, but it's important to understand why this is. The issue is that learning an assignment between some words and some objects misses the underlying structure that is critical to understanding. For example, one can point out the names of birds but know nothing about them. It is only once you can also point out details about their anatomy, how they interact with their environment, how they find food, etc., and then do some basic reasoning using these bird facts, that we might say you understand a lot about birds.
The assumption underlying the power of these language models is that a large enough text corpus will contain all these bird facts, perhaps indirectly through being deployed in conversation. If it can learn all these details, deploy them correctly within context, and even do rudimentary reasoning using such facts (there are examples of GPT-3 doing this), then it is reasonable to say that the language model captures understanding to some degree.
I think the biggest problem with this argument is the assumption that '[the structure in a natural language] maps to these real world relationships in natural ways'. One thing we know for sure is that human language maps to internal concepts of the human mind, and that it doesn't map directly to the real world at all. This is not necessarily a barrier to translation between human languages, but I think it makes the applicability of this idea to translations between human and alien languages almost certainly null.
Perhaps the most obvious case is any word related directly to the internal world - emotions, perceptions (colors, tastes, textures etc). There is no hope of translating these between organisms with different biologies.
However, essentially any human word, at least outside the sciences, falls in this category. At the most basic level, what you perceive as an object is a somewhat arbitrary modeling of the world specific to our biology and our size and time scale. To a being that perceived time much slower than us, many things that we see as static and solid may appear as more liquid and blurry. A significantly smaller or larger creature may see or miss many details of the human world and thus be unable to comprehend some of our concepts.
Another obstacle is that many objects are defined exclusively in terms of their uses in human culture and customs - there is no way to tell the difference between a sword, a scalpel, a knife, a machete etc unless you have an understanding of many particulars of some specific human society. Even the concept of 'cutting object' is dependent on some human-specific perceptions - for example, we perceive a knife as cutting bread, but we don't perceive a spoon as cutting the water when we take a spoonful from a soup, though it is also an object with a thin metal edge separating a mass of one substance into two separate masses (coincidentally, also doing so for consumption).
Beyond that, even the way we conceive mathematics may be strongly related to our biology (given that virtually all human beings are capable of learning at least arithmetic, and not a single animal is able to learn even counting), and possibly also to the structure of our language. Perhaps an alien mind has come up with a completely different approach to mathematics that we can't even fathom (though there would certainly be an isomorphism between their formulation of maths and ours, neither of our species may be capable of finding it).
And finally, there are so many words and concepts related to specific organisms in our natural environment that you simply can't translate them without some amount of firsthand experience. I could talk about the texture of silk for a long while, and you may be able to understand roughly what I'm describing, but you certainly won't be able to understand exactly what a silkworm is unless you've perceived one directly in some way that is specific to your species. You probably could understand that I'm talking about some kind of other life form, its rough size and some other details, but not more than that.
I disagree. Mental concepts have a high degree of correlation with the real world, otherwise we could not explain how we are so capable of navigating and manipulating the world to the degree that we do. So something that correlates with mental concepts necessarily correlates with things-in-the-world. Even things like emotions have real world function. Fear, for example, correlates with states in the world such that some alien species would be expected to have a corresponding concept.
>There is no way to tell the difference between a sword, a scalpel, a knife, a machete
There is some ambiguity here, but not as much as you claim. Machetes, for example, are mostly used in the context of "hacking", either vegetation or people, rather than the precision cuts of a knife or a scalpel. These subtle differences in contextual usage would be picked up by a strong language model and a sufficient text corpus.
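A minimal sketch of how differences in contextual usage can separate such words. It uses plain co-occurrence counts and cosine similarity, which is a far cruder technique than what a modern language model does, and it is swapped in here only for illustration. The corpus, the stopword list, and the function names are all invented for the example.

```python
from collections import Counter
import math

# Toy corpus, invented purely for illustration.
CORPUS = [
    "he hacked the vines with a machete",
    "she hacked the brush with a machete",
    "they hacked a path with a machete",
    "the surgeon made an incision with a scalpel",
    "the surgeon made a precise cut with a scalpel",
    "the doctor made an incision with a scalpel",
    "the surgeon made a precise cut with a knife",
    "the cook made a precise cut with a knife",
]

# Function words that carry little information about usage (hypothetical list).
STOPWORDS = {"a", "an", "the", "with", "he", "she", "they"}


def context_vector(target: str, sentences: list[str]) -> Counter:
    """Count the content words that share a sentence with `target`."""
    counts: Counter = Counter()
    for sentence in sentences:
        words = sentence.split()
        if target in words:
            counts.update(
                w for w in words if w != target and w not in STOPWORDS
            )
    return counts


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    vectors = {w: context_vector(w, CORPUS) for w in ("machete", "scalpel", "knife")}
    for a, b in (("knife", "scalpel"), ("knife", "machete"), ("machete", "scalpel")):
        print(a, b, round(cosine(vectors[a], vectors[b]), 3))
```

In this toy data "knife" and "scalpel" share precision-cutting contexts, while "machete" shares none with either, so usage alone pulls them apart. Whether real text separates every such pair this cleanly is exactly what is being debated above.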