Case in point: Self driving cars.
Also, consider that we need to pirate the whole internet to be able to do this, so these models are not creative. They are just directed blenders.
You'd need to prove that this assertion applies here. I understand that you can't deduce the future gains rate from the past, but you also can't state this as universal truth.
Knowledge engineering has a notion called "covered/invisible knowledge" which points to the small things we do unknowingly but changes the whole outcome. None of the models (even AI in general) can capture this. We can say it's the essence of being human or the tribal knowledge which makes experienced worker who they are or makes mom's rice taste that good.
Considering these are highly individualized and unique behaviors, a model based on averaging everything can't capture this essence easily if it can ever without extensive fine-tuning for/with that particular person.
Self-driving cars don't use LLMs, so I don't know how any rational analysis can claim that the analogy is valid.
>> The saying I have quoted (which has different forms) is valid for programming, construction and even cooking. So it's a simple, well understood baseline.
Sure, but the question is not "how long does it take for LLMs to get to 100%". The question is, how long does it take for them to become as good as, or better than, humans. And that threshold happens way before 100%.
None of the current models maybe, but not AI in general? There’s nothing magical about brains. In fact, they’re pretty shit in many ways.
I've heard this same thing repeated dozens of times, and for different domains/industries.
It's really just a variation of the 80/20 rule.
This is clear from the fact that you can distill the logic ability from a 700b parameter model into a 14b model and maintain almost all of it.
You just lose knowledge, which can be provided externally, and which is the actual "pirated" part.
The logic is _learned_
That models can be distilled has no bearing whatsoever on whether a model has learned actual knowledge or understanding ("logic"). Models have always learned sparse/approximately-sparse and/or redundant weights, but they are still all doing manifold-fitting.
The resulting embeddings from such fitting reflect semantics and semantic patterns. For LLMs trained on the internet, the semantic patterns learned are linguistic, which are not just strictly logical, but also reflect emotional, connotational, conventional, and frequent patterns, all of which can be illogical or just wrong. While linguistic semantic patterns are correlated with logical patterns in some cases, this is simply not true in general.
Of course the tech will be useful and ethical if these problems are solved or decided to be solved the right way.