If you think about it, that's probably the most difficult problem conversational LLMs need to overcome -- balancing sticking to the conversational history vs. abandoning it.
Humans do this intuitively.
But it seems really difficult to simultaneously (a) stick to previous statements sufficiently to avoid seeming ADD in a conveSQUIRREL and (b) know when to legitimately bail on a previous misstatement or something that was demonstrably false.
What's SOTA in how current models handle this, as conversations go deeper and situations like the one referenced above arise? (false statement, user correction, user expectation of a subsequent corrected statement that still follows the rest of the conversational history)