jmull 2y ago

How do you know the LLM is going to do what you confirmed?

There's a fundamental tension here: either you limit the LLM to a set of fixed actions that a user can individually understand and confirm, or you let it figure out what to do and how to do it given a higher-level goal.

In the first case, it's limited and not really better than a well-designed site or app. In the second, it's powerful but can run amok.

E.g., did it really fill out that four page order form for that e-bike you asked it to order? Maybe it uses the debit card instead of your credit card and your checking account is overdrawn. Maybe it sets the delivery address to your brother's address. Maybe it orders two bikes or 10 or the wrong model or wrong size.

It asks/confirms each step of the way so there's little chance for mistakes.

This is an inherent problem with any kind of delegation. You either micromanage the details or trust your agent to get them right.

nabakin 2y ago

The whole system isn't an LLM. At the point they are asking for confirmation, they've already parsed the required information and handed it off to normal code. It's not going to change again.

Ultimately, an LLM can only take a string as input and output another string. Any other functionality (Uber, search, travel, etc.) has to be manually programmed in via APIs. For the foreseeable future, a fixed set of actions is the only way to do this reliably, and it is what this device does, so the second case of an LLM run amok is a moot point.
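To make the hand-off concrete, here is a rough sketch of that pattern, not the device's actual code. It assumes the model emits a JSON string, and every name in it (`ALLOWED_ACTIONS`, `parse_action`, `describe`, `execute`) is hypothetical. The point is only that once the string is parsed into a frozen value, what the user confirms and what gets executed are the same object.

```python
import json
from dataclasses import dataclass

# Hypothetical fixed action set: action name -> required parameter names.
ALLOWED_ACTIONS = {
    "order_ride": {"pickup", "dropoff"},
    "search": {"query"},
}


@dataclass(frozen=True)
class Action:
    name: str
    params: tuple  # sorted (key, value) pairs; cannot be reassigned after parsing


def parse_action(llm_output: str) -> Action:
    """Turn the model's string output into one of the fixed actions, or fail."""
    data = json.loads(llm_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name = data.get("action")
    params = data.get("params", {})
    if name not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {name!r}")
    if not isinstance(params, dict) or set(params) != ALLOWED_ACTIONS[name]:
        raise ValueError(f"wrong parameters for {name}")
    return Action(name, tuple(sorted(params.items())))


def describe(action: Action) -> str:
    """Text shown to the user at the confirmation step."""
    args = ", ".join(f"{k}={v!r}" for k, v in action.params)
    return f"{action.name}({args})"


def execute(action: Action, confirmed: bool) -> str:
    """Ordinary code path; the model is no longer involved here."""
    if not confirmed:
        return "cancelled"
    # A real system would call the relevant API here (assumption).
    return f"executed {describe(action)}"
```

Anything outside the fixed set is rejected at parse time, which is the "first case" from the parent comment: reliable, but only as capable as the list of actions someone wrote by hand.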

carlhung 2y ago

I also don't understand why they call it an LLM. It isn't.

wesleychen 2y ago

I think they call it a Large Action Model, not an LLM.
