> individual 'hallucinations' can't be treated as bugs to troubleshoot
You are wrong here - my company can fix individual responses by adding specific targeted data to the RAG prompt. So a JIRA ticket for a wrong response can be fixed in 2 days.
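Concretely, that kind of fix looks roughly like this - a curated override store whose snippets get injected into the prompt context when a matching question comes in. This is just a minimal sketch with hypothetical names, not any particular product's pipeline:

```python
def build_prompt(question: str, overrides: dict[str, str]) -> str:
    """Assemble a RAG prompt, prepending any curated snippet whose
    keyword appears in the question (case-insensitive match)."""
    context = [snippet for keyword, snippet in overrides.items()
               if keyword.lower() in question.lower()]
    header = "\n".join(context) if context else "(no curated context)"
    return f"Context:\n{header}\n\nQuestion: {question}\nAnswer:"

# After a JIRA ticket reports a wrong answer, someone adds one entry:
overrides = {
    "return policy": "Returns are accepted within 30 days with a receipt.",
}

print(build_prompt("What is your return policy?", overrides))
```

The point is that each "hallucination" becomes a data-entry task on the retrieval side, without touching the model itself.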
It's important to understand that you're addressing the problem by adding a layer on top of the core technology, to mitigate or mask how it actually works.
At scale, your solution looks like bolting an expert system on top of the LLM, which is something some researchers and companies are actually working on.
Wow, that sounds great: just have every customer who interacts with your LLM come back to the site in 2 days to get the real answer to their question. How can I invest?
I've said this before, but I'm not convinced LLMs should be public-facing. Some companies have been burned by them, and in my opinion LLMs should be about helping customer support people find answers faster.