There are a lot of extremely legitimate concerns, like the environmental impact and so on.
But I just laugh when they point out that LLMs are merely clever regurgitators of their previous inputs… as if this isn’t how we as humans operate nearly all of the time. People realllllllllly want to think they’re special snowflakes.
Ask a human to plan a trip:
They do research, pick destinations led by their own experience/likes/dislikes, compare to other guides, plan itineraries so they can get there, then check and share.
Ask an LLM to plan a trip:
It takes the prompt and continues it based on weights in the training data. If there is no data it picks the most likely thing (maybe made up). If there is it’ll mostly add things from that data. Maybe it’ll make tool calls and pull in data that way too but you can’t actually trust all the details.
These two processes are so different that it’s important to understand how an LLM actually works, which is nothing like how a human does.
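As a deliberately crude illustration of "continuing the prompt", here is a toy bigram sketch. It is an assumption-laden stand-in, not how a real LLM is built: a real model uses learned neural-network weights rather than word counts, and the names `train_bigrams` and `continue_prompt` plus the tiny corpus are made up for this example. It only shows the two behaviours described above: follow the data when there is some, and still emit a plausible-looking guess when there is none.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count which word follows which in a tiny corpus (hypothetical 'training data')."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows


def continue_prompt(prompt, follows, max_words=5):
    """Greedily extend the prompt one word at a time."""
    words = prompt.lower().split()
    if not words:
        return prompt

    # Overall next-word frequencies, used when the last word was never seen.
    overall = Counter()
    for counts in follows.values():
        overall.update(counts)

    for _ in range(max_words):
        options = follows.get(words[-1])
        if options:
            # Data exists for this word: pick its most frequent successor.
            nxt = options.most_common(1)[0][0]
        elif overall:
            # No data: fall back to a globally likely word (the "made up" case).
            nxt = overall.most_common(1)[0][0]
        else:
            break
        words.append(nxt)
    return " ".join(words)


# Made-up corpus, purely for illustration.
corpus = ["visit the old town", "visit the castle", "visit the old harbor"]
follows = train_bigrams(corpus)
seen = continue_prompt("please visit", follows, max_words=2)
unseen = continue_prompt("zzz", follows, max_words=1)
```

Note that the unseen-word case still produces fluent-looking output, which is the part you can’t trust.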
I don't trust LLMs for this application lol.
But I'd also want to point out that the way you're characterizing an LLM planning a trip doesn't have any structure to it, which indicates that in your scenario you're not using any kind of harness. I've been amazed at how capable even 30-billion-parameter models are when I put them inside a harness that provides structure and task management. If you consider that scenario, especially with the ability to search the web and use skills, suddenly the LLM process looks a lot more like the human one.
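For illustration, a minimal sketch of what "structure and task management" could look like. Everything here is an assumption rather than any particular product's API: `run_harness`, the `TOOL:<name>:<arg>` reply convention, and the stub `fake_model` and `fake_search` functions are hypothetical stand-ins for a real model call and a real web search.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    done: bool = False
    result: str = ""


def run_harness(goal, steps, call_model, tools):
    """Feed the model one step at a time and carry notes forward between steps."""
    tasks = [Task(name=step) for step in steps]
    notes = []
    for task in tasks:
        prompt = f"Goal: {goal}\nNotes so far: {notes}\nCurrent step: {task.name}"
        reply = call_model(prompt)

        # Hypothetical convention: "TOOL:<name>:<arg>" asks the harness to run a tool.
        parts = reply.split(":", 2)
        if len(parts) == 3 and parts[0] == "TOOL" and parts[1] in tools:
            reply = tools[parts[1]](parts[2])

        task.result = reply
        task.done = True
        notes.append(f"{task.name}: {reply}")
    return tasks


def fake_model(prompt):
    """Stub standing in for a real LLM call."""
    if "Current step: research" in prompt:
        return "TOOL:search:best time to visit Lisbon"
    return "ok"


def fake_search(query):
    """Stub standing in for a real web search."""
    return f"results for {query!r}"


tasks = run_harness(
    "plan a trip",
    ["research", "pick destinations", "plan itinerary"],
    fake_model,
    {"search": fake_search},
)
```

The step list mirrors the human process upthread (research, pick, plan), which is the point: the structure comes from the harness, not from the model.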
Where humans and (current) LLMs differ the most is their failure mode. A human friend could be bad at planning trips, but that's kinda predictable, we're used to it, we know how to catch that Exception. LLMs on the other hand still have failure modes that come across as really wacky, like, what are they smoking in Mountain View?
Which might actually serve as better evidence of different internal workings at a deeper level than just parroting well-known superficial features of stochastic whatevertheysay.
They're obviously achieved in drastically different ways at a low enough level; LLMs obviously do not simulate neurons or any biological construct. (For the record, I'm absolutely not one of those people who thinks LLMs are "alive" or should be treated like they are.)
Reminds me of the olllllld days of Pentium IIs when people got N64 emulation working shockingly quickly using HLE (high-level emulation) techniques. If you weren't around for this, it was quite the shocker at the time. I think the analogy is doubly apt, because HLE has some serious limitations... it gets you maybe 80% of the way there really fast, and for the remaining 20% you need to roll up your sleeves and do serious LLE (low-level emulation).
https://en.wikipedia.org/wiki/UltraHLE
> It takes the prompt and continues it based on weights in the training data. If there is no data it picks the most likely thing (maybe made up). If there is it’ll mostly add things from that data. Maybe it’ll make tool calls and pull in data that way too but you can’t actually trust all the details.
I'd like you to point out which bits of this are different from talking to humans. If you replace "training data" with "memories", this is pretty much exactly how things might go if you asked a friend (or perhaps a flaky travel agent) for travel advice.
Note that I'm not arguing that LLMs are particularly talented at this particular use case. I'm pointing out that humans are also pretty unreliable.
You're also doing that thing where you point out that LLMs can be unreliable (yes, they are) without acknowledging how flawed nearly every other source of information is: people, websites, etc. I'm not defending LLMs in that regard... I'm just saying it's not a differentiator.