As a power user myself, I don't find LLMs to be tools I can depend on. I try to use them for well-bounded, low-stakes tasks like coming up with sports trivia and generating boilerplate "hello world" code for arbitrary targets (e.g., NES 6502), and they stink at it. Hallucinations aren't a problem you can just wave away, because accuracy matters for most tasks. LLMs are less a hammer and chisel and more a slot machine that may or may not barf out something of value to me. If they fail at these simple tasks, I'd be a fool to rely on them for anything more substantial.
It's interesting how varied experiences are. I don't dismiss hallucinations, but my workflows avoid them by design: I'd never treat the model as a knowledge source, e.g., by generating trivia questions directly from it. So I wonder if it's also a matter of expectations and understanding the limitations. From my perspective, I would never write queries like yours without a supporting data set, roughly like the sketch below.
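To make that concrete, here's a minimal sketch of what I mean by grounding the query: the facts come from a source I already trust, and the model is only asked to rephrase them into question form, never to recall anything on its own. The example facts and the way you'd hand the prompt to a model are placeholders, not a specific dataset or API.

```python
# Sketch: grounded trivia generation. The model never supplies facts;
# it only rewords facts we pass in, which sidesteps recall hallucinations.

# Verified facts pulled from a trusted source (placeholder examples).
FACTS = [
    "The Boston Celtics have won 18 NBA championships.",
    "Wilt Chamberlain scored 100 points in a single game on March 2, 1962.",
]

def build_prompt(facts: list[str]) -> str:
    """Embed the verified facts and constrain the model to rephrasing only."""
    fact_block = "\n".join(f"- {f}" for f in facts)
    return (
        "Rewrite each fact below as a trivia question with its answer.\n"
        "Use ONLY the facts given. Do not add, infer, or correct anything.\n\n"
        f"Facts:\n{fact_block}"
    )

if __name__ == "__main__":
    # Pass this prompt to whatever model client you use.
    print(build_prompt(FACTS))
```

The point is that the model's job shrinks from "know things" to "reformat things," and the second job is one it's actually reliable at.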