But this thread is about misuse of the term as applied to the weights package. Those of us who know what open source means should not keep diluting the term by applying it to these LLMs.
It's just like how, even for truly open source software, you still need to bring your own hardware to run it on.
But we don't actually know all that much about how language really works, for all the resources we spend on linguistics - as the old IBM joke about AI goes, "quality of the product increases every time we fire a linguist" (which is to say, we consistently get better results by throwing "every written word known to man" at a blank model than we do by trying to construct things from our understanding).
All that said, just because we're taking a different, and quite possibly slower or less compute-efficient, route doesn't mean we can't get to AGI this way.
No, we can't few-shot it, and we don't get there faster (but we develop a lot of other capabilities along the way). We train on a lot more data: the human brain, unlike an LLM, keeps training on all that data during the same processes it uses for "inference," and it receives sensory data estimated on the order of a billion bits per second. By the time we start using language we've trained on an enormous amount of data (the 15 trillion tokens, at ~17 bits per token given Llama 3's ~128k vocabulary, that Llama 3 was trained on amount to something like a few days of human sense data). Humans are simply trained on, and process, vastly richer multimodal data instead of text streams.
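The "a few days" comparison above is easy to sanity-check. A minimal back-of-envelope sketch, taking the thread's own figures as assumptions (the ~1 Gbit/s sensory-bandwidth estimate, a 15T-token corpus, and a ~128k-token vocabulary giving roughly 17 bits of raw entropy per token):

```python
import math

# All figures below are the thread's assumptions, not measurements.
tokens = 15e12                       # Llama 3 pretraining corpus, ~15T tokens
vocab_size = 128_000                 # Llama 3 vocabulary, ~128k entries
bits_per_token = math.log2(vocab_size)   # ~17 bits of raw entropy per token

sensory_bits_per_sec = 1e9           # the ~1 Gbit/s human sensory estimate

corpus_bits = tokens * bits_per_token
seconds_of_sense_data = corpus_bits / sensory_bits_per_sec
days = seconds_of_sense_data / 86_400

print(f"{bits_per_token:.1f} bits/token, corpus ~ {corpus_bits:.2e} bits")
print(f"~ {days:.1f} days of sensory input at 1e9 bits/s")
```

At these numbers the whole corpus works out to roughly three days of sensory input, which is where the "a few days" claim comes from; note the raw-entropy figure ignores that text is far more compressed than raw sense data, so this is an upper bound on the comparison in tokens' favor.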