You can model it as a state machine, where the LLM decides which state to advance to. For developer ergonomics, strongly typed outputs help: for example, you can force a function call at each step, with one of the call's arguments being an enum that specifies the state to advance to.
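To make that concrete, here is a rough sketch of the idea in Python. Everything in it is made up for illustration: the `State` names, the `advance` function name, the `TRANSITIONS` table, and the `call_model` callback that stands in for whatever LLM client you use. The schema is plain JSON Schema rather than any particular provider's exact tool format, so you would need to adapt it to your API.

```python
import json
from enum import Enum


class State(str, Enum):
    # Hypothetical states for illustration only.
    GREET = "greet"
    COLLECT_INFO = "collect_info"
    CONFIRM = "confirm"
    DONE = "done"


# Allowed transitions: the model may only pick from these at each step.
TRANSITIONS = {
    State.GREET: [State.COLLECT_INFO],
    State.COLLECT_INFO: [State.COLLECT_INFO, State.CONFIRM],
    State.CONFIRM: [State.COLLECT_INFO, State.DONE],
    State.DONE: [],
}


def advance_tool_schema(current: State) -> dict:
    """Build the schema for the forced function call at the current state."""
    return {
        "name": "advance",  # hypothetical function name
        "parameters": {
            "type": "object",
            "properties": {
                "next_state": {
                    "type": "string",
                    # The enum restricts the model to legal next states.
                    "enum": [s.value for s in TRANSITIONS[current]],
                },
                "reason": {"type": "string"},
            },
            "required": ["next_state"],
        },
    }


def parse_advance(current: State, raw_arguments: str) -> State:
    """Validate the model's call arguments and return the next state."""
    args = json.loads(raw_arguments)
    next_state = State(args["next_state"])  # ValueError on an unknown value
    if next_state not in TRANSITIONS[current]:
        raise ValueError(
            f"illegal transition {current.value} -> {next_state.value}"
        )
    return next_state


def run(call_model, start: State = State.GREET, max_steps: int = 10) -> list:
    """Drive the machine. call_model(schema) is a stand-in for the real
    LLM call and must return the function-call arguments as a JSON string."""
    state, path = start, [start]
    for _ in range(max_steps):
        if not TRANSITIONS[state]:  # terminal state, nothing left to pick
            break
        state = parse_advance(state, call_model(advance_tool_schema(state)))
        path.append(state)
    return path
```

Validating on your side as well as constraining the schema is deliberate: even when the API enforces the enum, checking the transition yourself keeps the machine honest if the model or the provider misbehaves.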
Shoot me an email if you want to discuss specifics!