You still need someone who understands which approach to use, and why, to get the data you need without getting back completely wrong numbers that _look_ perfectly fine but reflect fantasy, not reality.
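To make "wrong numbers that look fine" concrete, here is a minimal sketch of one common way it happens: a join that silently duplicates rows before a sum. The `orders` and `shipments` tables and their contents are made up for illustration and are not from anyone's real schema.

```python
import sqlite3

# Hypothetical schema: one order can have several shipments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE shipments (id INTEGER PRIMARY KEY, order_id INTEGER);
INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
INSERT INTO shipments VALUES (1, 1), (2, 1), (3, 2);
""")

# Plausible-looking query: joining to shipments repeats order 1 once per
# shipment, so its amount is counted twice.
naive_total = conn.execute(
    "SELECT SUM(o.amount) FROM orders o "
    "JOIN shipments s ON s.order_id = o.id"
).fetchone()[0]

# Summing over orders alone counts each order exactly once.
correct_total = conn.execute(
    "SELECT SUM(amount) FROM orders"
).fetchone()[0]

print(naive_total, correct_total)
```

Both queries run without error and both return a tidy number. Only someone who knows that shipments are one-to-many against orders can tell which one is revenue and which one is fantasy.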
It is still the early days. The goal is to give developers tools that make this easier.
It's not the early days.
Not by a country mile.
To quote Cory Doctorow:
> I don’t see any path from continuous improvements to the (admittedly impressive) “machine learning” field that leads to a general AI any more than I can see a path from continuous improvements in horse-breeding that leads to an internal combustion engine.
You can counter that this doesn't necessarily need an AGI, but that doesn't change the fact that you can't crank this engine harder and expect it to power an airplane.
And, as always: https://hachyderm.io/@inthehands/112006855076082650
> You might be surprised to learn that I actually think LLMs have the potential to be not only fun but genuinely useful. “Show me some bullshit that would be typical in this context” can be a genuinely helpful question to have answered, in code and in natural language — for brainstorming, for seeing common conventions in an unfamiliar context, for having something crappy to react to.
> Alas, that does not remotely resemble how people are pitching this technology.
Similarly, but from my far less notable self in another discussion today:
> [H]uman exuberance is riding on the (questionable) idea that a really good text-correlation specialist can effectively impersonate a general AI.
> Even worse: Some people assume an exceptional text-specialist model will effectively meta-impersonate a generalist model impersonating a different kind of specialist!
Therein lies the nuance. Some people expect to get a natural-language answer back. Others expect a data table. Others expect correct SQL. This is why it's so important to understand the use case and not bucket everything together.
Our standards for AI are too high.
If an autonomous car causes one wreck per ten million miles, people set the cars on fire.
When someone finds an LLM that suggests eating a small rock every day, that anecdote is used to discredit all LLM results.
This shit makes errors. But what is the alternative? Human analysts who get joins wrong four times in ten? Human drivers who cause wrecks 30 times per ten million miles? Human social media recommendations about nutritional supplements?
Decisions should be made against an alternative, not against some fictitious perfect solution.
The truth is everyone knows LLMs can't tell correct from error, can't tell real from imagined, and cannot care.
The word "hallucinate" has been used to explain when an LLM gets things wrong, when it's equally applicable to when it gets things right.
Everyone thinks the hallucinations can be trained out, leaving only edge cases. But in reality, edge cases are often horror stories. And an LLM edge case isn't a known quantity for which, say, limits, tolerances and test suites can really do the job, because there's nobody with domain skill saying: look, this is safe or viable within these limits.
All LLM products are built with the same intention: we can use this to replace real people or expertise that is expensive to develop, or sell it to companies on that basis.
If it goes wrong, they know the excited customer will invest an unbillable amount of time re-training the LLM or double-checking its output -- either developing a new, unnecessary, tangential skill or still spending time on the very work the LLM was meant to replace.
But hopefully you only need a handful of such babysitters, right? And if it goes really wrong there are disclaimers and legal departments.