undefined | Better HN

0 pointsToValueFunfetti11mo ago0 comments

If the only calculators that existed failed at 5% of the calculations, or if the only communication tools miscommunicated 5% of the time, we would still use both all the time. They would be far less than 95% as useful as perfect versions, but drastically better then not having the tools at all.

0 comments

gitremote11mo ago

Absolutely not. We'd just do the calculations by hand, which is better than running the 95%-correct calculator and then doing the calculations by hand anyway to verify its output.

ToValueFunfettiOP11mo ago

Suppose you work in a field where getting calculations right is critical. Your engineers make mistakes less than .01% of the time, but they do a lot of calculations and each mistake could cost $millions or lives. Double- and triple-checking help a lot, but they're costly. Here's a machine that verifies 95% of calculations, but you'd still have to do 5% of the work. Shall I throw it away?

Unreliable tools have a good deal of utility. That's an example of them helping reduce the problem space, but they also can be useful in situations where having a 95% confidence guess now matters more that a 99.99% confidence one in ten minutes- firing mortars in active combat, say.

There's situations where validation is easier than computation; canonically this is factoring, but even division is much simpler than multiplication. It could very easily save you time to multiply all of the calculator's output by the dividend while performing both a multiplication and a division for the 5% that are wrong.

edit: I submit this comment and click to go the front page and right at the top is Unsure Calculator (no relevance). Sorry, I had to mention this

diputsmonro11mo ago

> Here's a machine that verifies 95% of calculations, but you'd still have to do 5% of the work.

The problem is that you don't know which 5% are wrong. The AI is confidently wrong all the time. So the only way to be sure is to double check everything, and at some point its easier to just do it the right way.

Sure, some things don't need to be perfect. But how much do you really want to risk? This company thought a little bit of potential misinformation was acceptable, and so it caused a completely self inflicted PR scandal, pissed off their customer base, and lost them a lot of confidence and revenue. Was that 5% error worth it?

Stories like this are going to keep coming the more we rely on AI to do things humans should be doing.

Someday you'll be affected by the fallout of some system failing because you happen to wind up in the 5% failure gap that some manager thought was acceptable (if that manager even ran a calculation and didn't just blindly trust whatever some other AI system told them) I just hope it's something as trivial as an IDE and not something in your car, your bank, or your hospital. But certainly LLMs will be irresponsibly shoved into all three within the next few years, if it's not there already.

1 more reply

Tainnor11mo ago

> Unreliable tools have a good deal of utility.

This is generally true when you can quantify the unreliability. E.g. random prime number tests with a specific error rate can be combined so that the error rates multiply and become negligible.

I'm not aware that we can quantify the uncertainty coming out of LLM tools reliably.

mrheosuper11mo ago

> you'd still have to do 5% of the work

No, you still have to do 100% of the work.

1 more reply

jimbokun11mo ago

> Here's a machine that verifies 95% of calculations

Which 95% did it get right?

j / k navigate · click thread line to collapse

0 comments

gitremote11mo ago

Absolutely not. We'd just do the calculations by hand, which is better than running the 95%-correct calculator and then doing the calculations by hand anyway to verify its output.

ToValueFunfettiOP11mo ago

edit: I submit this comment and click to go the front page and right at the top is Unsure Calculator (no relevance). Sorry, I had to mention this

diputsmonro11mo ago

> Here's a machine that verifies 95% of calculations, but you'd still have to do 5% of the work.

Stories like this are going to keep coming the more we rely on AI to do things humans should be doing.

1 more reply

Tainnor11mo ago

> Unreliable tools have a good deal of utility.

This is generally true when you can quantify the unreliability. E.g. random prime number tests with a specific error rate can be combined so that the error rates multiply and become negligible.

I'm not aware that we can quantify the uncertainty coming out of LLM tools reliably.

mrheosuper11mo ago

> you'd still have to do 5% of the work

No, you still have to do 100% of the work.

1 more reply

jimbokun11mo ago

> Here's a machine that verifies 95% of calculations

Which 95% did it get right?

j / k navigate · click thread line to collapse