Better HN

TheRealPomax 2y ago

With what level of accuracy? And what guarantee of correctness? Because a report that happens to get the joins wrong once every 1000 reports is going to lead to fun legal problems.

You still need someone who understands why you should use which approach to get the data you need without getting completely wrong numbers back that _look_ perfectly fine but reflect fantasy, not reality.

10 comments · 3 top-level

saigal 2y ago · 6 in thread

i agree that there will be "early adopter" type use cases and others that might take a while (e.g. healthcare with HIPAA compliance)

it is still the early days. the goal is to give the developer tools to do this more easily.

chx 2y ago

Enough of this weasel talk.

It's not the early days.

Not by a country mile.

To quote Cory Doctorow

> I don’t see any path from continuous improvements to the (admittedly impressive) “machine learning” field that leads to a general AI any more than I can see a path from continuous improvements in horse-breeding that leads to an internal combustion engine.

You can counter it doesn't necessarily need an AGI here but that doesn't change the fact you can't crank this engine harder and expect it to power an airplane.

And, as always https://hachyderm.io/@inthehands/112006855076082650

> You might be surprised to learn that I actually think LLMs have the potential to be not only fun but genuinely useful. “Show me some bullshit that would be typical in this context” can be a genuinely helpful question to have answered, in code and in natural language — for brainstorming, for seeing common conventions in an unfamiliar context, for having something crappy to react to.

> Alas, that does not remotely resemble how people are pitching this technology.

Terr_ 2y ago

> can't crank this engine harder and expect it to power an airplane.

Similarly, but from my far less notable self in another discussion today:

> [H]uman exuberance is riding on the (questionable) idea that a really good text-correlation specialist can effectively impersonate a general AI.

> Even worse: Some people assume an exceptional text-specialist model will effectively meta-impersonate a generalist model impersonating a different kind of specialist!

warkdarrior 2y ago

Indeed, AI is not marketed as a BS generator, just as HTTP is not marketed as a spam/ad/fraud/harassment transport protocol. All technologies are dual-use, deal with it!

altdataseller 2y ago

It's not the early days in terms of expecting digital tools to be correct 99% of the time. The early-adoption age was back in 2000-2009. Now everyone expects polished tools that do what they expect them to do.

saigal 2y ago

"...what they expect them to do"

therein lies the nuance. some people expect to get a natural language answer back. others expect to get a data table back. others expect to get correct SQL back. this is why it's so important to understand the use case and not bucket everything together.

saigal 2y ago

if you expect correct 99% of the time, you will be waiting a very, very, very long time for all but the most constrained use cases

panarky 2y ago · 1 in thread

Getting joins wrong once in 1000 queries would beat 99.9% of experienced data analysts.

Our standards for AI are too high.

If an autonomous car causes one wreck per ten million miles, people set the cars on fire.

When someone finds an LLM that suggests eating a small rock every day, that anecdote is used to discredit all LLM results.

This shit makes errors. But what is the alternative? Human analysts who get joins wrong four times in ten? Human drivers who cause wrecks 30 times per ten million miles? Human social media recommendations about nutritional supplements?

saigal 2y ago

The autonomous car analogy is a good one. The technology is overall far superior to a human (probably scrolling TikTok) driving, but the moment it makes a mistake we remove the AEV, even though it would be of higher societal benefit.

Decisions should be made against an alternative, not against some fictitious perfect solution.

ecjhdnc2025 2y ago

> With what level of accuracy? And what guarantee of correctness? Because a report that happens to get the joins wrong once every 1000 reports is going to lead to fun legal problems.

The truth is everyone knows LLMs can't tell correct from error, can't tell real from imagined, and cannot care.

The word "hallucinate" has been used to explain when an LLM gets things wrong, when it's equally applicable to when it gets things right.

Everyone thinks the hallucinations can be trained out, leaving only edge cases. But in reality, edge cases are often horror stories. And an LLM edge case isn't a known quantity for which, say, limits, tolerances and test suites can really do the job. Because there's nobody with domain skill saying, look, this is safe or viable within these limits.

All LLM products are built with the same intention: we can use this to replace real people or expertise that is expensive to develop, or sell it to companies on that basis.

If it goes wrong, they know the excited customer will invest an unbillable amount of time re-training the LLM or double-checking its output -- developing a new unnecessary, tangential skill or still spending time doing what the LLM was meant to replace.

But hopefully you only need a handful of such babysitters, right? And if it goes really wrong there are disclaimers and legal departments.
