The other thing I’ve noticed is something you alluded to: the LLM being “confidently incorrect”. It speaks so authoritatively about things, and when I call it out it agrees and corrects itself.
The more I use these things (I try to ask against multiple LLMs), the more wary I am of the output. And it seems that companies over the past year rushed to jam chatbots into any orifice of their application where they could. I’m curious to see if the incorrectness of them will start to have a real impact.