I've had much greater success doing the RAG process on a fine-tuned model.
No, especially people wrapping RAG frameworks around models they don’t have access to fine-tune (e.g., GPT-4 if you aren't OpenAI and/or Microsoft).
Edit: Of course, further evidence that this method is useful doesn't help people in that condition, though.
I can't see why this would only work for a limited domain. The key problem being solved is the known one where RAG returns an irrelevant chunk. It seems like the "benefit" is training a model to ignore irrelevant chunks.
I'm guessing it's because training on multiple domains costs money, so they limited their research to one domain at a time, but I'm not sure whether there is a "bigger reason" why this couldn't be an approach to a fine-tuned, general "make answers from only relevant chunks" model. The paper seems to imply this only works for specific domains, but I can't see why.
> We demonstrate that our RAG approach trains the model to perform better RAG on the set of documents it is trained on i.e., in-domain. By removing the oracle documents in some instances of the training data, we are compelling the model to memorize domain-knowledge.
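For anyone trying to picture what "removing the oracle documents in some instances" looks like as a data-prep step, here is a rough sketch, not the paper's code. The function name, the dict keys, `oracle_fraction`, and `num_distractors` are all made up for illustration, and 0.8 is just a placeholder value. It assumes each question comes with one known "oracle" document and that the distractor pool has more documents than are sampled.

```python
import random


def build_training_examples(qa_items, distractor_pool,
                            oracle_fraction=0.8, num_distractors=3, seed=0):
    """Build fine-tuning examples where only some contexts contain the oracle.

    qa_items: list of dicts with hypothetical keys 'question', 'answer', 'oracle'.
    distractor_pool: list of document strings to draw irrelevant chunks from.
    """
    rng = random.Random(seed)
    examples = []
    for item in qa_items:
        # Never let the oracle sneak in as a "distractor".
        candidates = [d for d in distractor_pool if d != item["oracle"]]
        keep_oracle = rng.random() < oracle_fraction
        # Keep the context size constant whether or not the oracle is present.
        k = num_distractors if keep_oracle else num_distractors + 1
        context = rng.sample(candidates, k)
        if keep_oracle:
            context.append(item["oracle"])
        rng.shuffle(context)  # so position does not give the oracle away
        examples.append({
            "question": item["question"],
            "context": context,
            "answer": item["answer"],  # target is the answer in both cases
            "has_oracle": keep_oracle,
        })
    return examples
```

In the oracle-free examples the target is still the correct answer even though no chunk in the context supports it, which is presumably the part that pushes the model toward memorizing the domain.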
What if you wanted to train it to say that it didn't find the answer?
Interesting hypothesis, but probably not what was meant.
“editor” is shaping up to be the hot new job for fleshy consciousnesses by the early thirties.