Agreed. GPT-3 and GPT-3.5 commonly hallucinate. GPT-4 can certainly be made to behave badly, but on the real questions I've put to GPT-4 it has a 0% hallucination rate. The few wrong answers it has given have been "sensibly wrong", in that an experienced human programmer would very likely have made the same mistake (e.g., lots of Stack Overflow answers are wrong in the same way), and even its wrong answers have been helpful in guiding me towards the correct solution.
These occasional, "sensibly wrong" GPT-4 answers are fundamentally different from the hallucinated "answers" I've received from GPT-3 and GPT-3.5: correctly formatted academic bibliography citations for technical papers that never existed, by authors who never existed, in journals that never existed.