There's a fundamental tension here: either you limit the LLM to a set of fixed actions that a user can individually understand and confirm, or you let it figure out what to do and how to do it given a higher-level goal.
In the first case, the LLM is limited and not really better than a well-designed site or app. In the second, it's powerful but can run amok.
E.g., did it really fill out that four-page order form for the e-bike you asked it to order? Maybe it uses your debit card instead of your credit card and overdraws your checking account. Maybe it sets the delivery address to your brother's address. Maybe it orders two bikes, or 10, or the wrong model or the wrong size.
OR
It asks/confirms each step of the way so there's little chance for mistakes.
This is an inherent problem with any kind of delegation. You either micromanage the details or trust your agent to get them right.
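
To make the two modes concrete, here's a rough Python sketch. Everything in it is made up for illustration (the `Action` class, the `execute` function, the e-bike steps); it's not how any real agent framework works, just one way the "confirm every step" vs. "trust the agent" choice might look in code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step the agent wants to take (hypothetical structure)."""
    description: str          # human-readable summary shown to the user
    run: Callable[[], None]   # the side effect itself


def execute(plan: List[Action],
            confirm: Optional[Callable[[Action], bool]] = None) -> List[str]:
    """Run the plan. With `confirm`, ask before every step (micromanage mode);
    without it, run everything unchecked (trust mode)."""
    done = []
    for action in plan:
        if confirm is not None and not confirm(action):
            break  # user rejected a step: stop before any further side effects
        action.run()
        done.append(action.description)
    return done


# Stand-in for real side effects: just record what happened.
log: List[str] = []
plan = [
    Action("use credit card", lambda: log.append("card")),
    Action("ship to brother's address", lambda: log.append("address")),
    Action("order 10 bikes", lambda: log.append("order")),
]

# Trust mode: all three steps run, including the wrong address and quantity.
trusted = execute(plan)

# Micromanage mode: the user (simulated here) rejects the bad address,
# so only the first step ever runs.
log.clear()
checked = execute(plan, confirm=lambda a: "brother" not in a.description)
```

In a real tool the `confirm` callback would be a prompt to the user, which is exactly the part that gets tedious on a four-page form.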