Good question!
So LLMs fundamentally do not encode the nature of facts and data. They're text prediction engines: extremely sophisticated ones, but prediction engines all the same. There's no embedded comprehension, and that's fundamental to the approach companies like OpenAI are taking.
That's why it's so easy to find factual errors in the generated text from these models: they can create text that's pleasing, but that's as much as they can do.
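To make the "text prediction engine" point concrete, here's a deliberately tiny sketch. It's a toy bigram model, not how GPT works internally (real LLMs are neural networks trained on vastly more text), and the corpus and function names are made up for illustration. The only thing it learns is which word tends to follow which, so nothing in it can tell a true sentence from a false one.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy corpus (an assumption for illustration).
# Note that it mixes a true statement with a false one.
CORPUS = (
    "the capital of france is paris . "
    "the capital of france is lyon . "
    "the capital of spain is madrid ."
)

def train_bigrams(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_words=8, seed=0):
    """Sample one word at a time, weighted by how often it followed the previous word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(max_words - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # no known continuation for this word
        words, weights = zip(*followers.items())
        out.append(rng.choices(words, weights=weights)[0])
        if out[-1] == ".":
            break  # stop at the end of a sentence
    return " ".join(out)

if __name__ == "__main__":
    model = train_bigrams(CORPUS)
    for seed in range(5):
        print(generate(model, "the", seed=seed))
```

Every sentence it prints is fluent, but after "is" the model picks among "paris", "lyon", and "madrid" purely by frequency, so it can just as happily tell you the capital of Spain is Paris. My argument is that scaling this idea up makes the text far more convincing without adding a notion of what's true.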
The next leap, to generate language that's also accurate, will require new techniques to bake in actual understanding.
Until then, these models will always have issues with hallucinations.
Or, at least, that's my expectation. Certainly nothing we've seen in GPT-3, GPT-3.5, or Bing, which is rumoured to be based on something close to GPT-4, indicates any advances in this area.