If you are using LLM RAG – you should be doing RAFT

(techcommunity.microsoft.com)

52 points · shishirpatil · 2y ago · 13 comments

jnwatson · 2y ago · 2 in thread

Does any of this matter when you have 5 million token input and can just shove everything into the input?

imtringued · 2y ago

How to learn quadratic scaling the hard way.

jscheel · 2y ago

Yes, because 5 million token input is slow, expensive, and error-prone.
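To put rough numbers on the quadratic-scaling point, here is a minimal sketch. It assumes plain full self-attention, whose compute grows with the square of the sequence length; real long-context models may use cheaper attention variants, and the 4,096-token RAG prompt size is an illustrative assumption, not a figure from the thread.

```python
def attention_cost_ratio(long_tokens: int, short_tokens: int) -> float:
    """Relative attention cost of two prompt lengths, assuming cost ~ n**2."""
    return (long_tokens / short_tokens) ** 2


# Hypothetical comparison: stuffing 5M tokens into the prompt versus
# a retrieved-context prompt of 4,096 tokens (assumed size).
ratio = attention_cost_ratio(5_000_000, 4_096)
print(f"Full-context attention is roughly {ratio:,.0f}x more expensive")
```

Under these assumptions the attention cost differs by a factor of about 1.5 million, which is the "slow, expensive" part of the reply above.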

jondwillis · 2y ago · 1 in thread

Page me when this process is at least partially automated and continuous.

heyoni · 2y ago

Call me when there’s a nix flake for it!

matt3D · 2y ago · 1 in thread

Am I missing the point or is this not how everyone was doing RAG before?

I've had much greater success doing the RAG process on a fine-tuned model.

dragonwriter · 2y ago

> Am I missing the point or is this not how everyone was doing RAG before?

No, especially people wrapping RAG frameworks around models they don’t have access to fine tune (e.g., GPT-4 if you aren't OpenAI and/or Microsoft.)

Edit: Of course, further evidence that this method is useful doesn't help people in that condition, though.

tianjunz · 2y ago

This is a three-way collaboration between Berkeley AI, Microsoft Azure, and Meta AI! RAFT targets domain-specific RAG, a more focused and increasingly favored setting than the general open-book exam. In the domain-specific setting, the domain in which the LLM will be evaluated is known in advance and is the same one used at inference time. The LLM can answer prompts using any and all information from this particular domain, on which it has been specifically fine-tuned.

Blogs: https://gorilla.cs.berkeley.edu/blogs/9_raft.html

petervandijck · 2y ago

It's "Retrieval Augmented Fine Tuning". The related blog post is interesting: https://gorilla.cs.berkeley.edu/blogs/9_raft.html

adampk · 2y ago

> Retrieval Aware Fine-Tuning (RAFT), presents a novel recipe to prepare fine-tuning data to tailor the models for domain-specific open-book setting, equivalent to in-domain RAG.

I cannot see why this is limited to a single domain. The key problem being solved is the known one where RAG returns an irrelevant chunk. It seems like the "benefit" is training a model to ignore irrelevant chunks.

I am guessing it is because training on multiple domains costs money, so they limited their research to one domain at a time, but I'm not sure whether there is a "bigger reason" why this isn't an approach to a fine-tuned "make answers from only relevant chunks" model. The paper seems to imply this only works for specific domains, but I can't see why.

skybrian · 2y ago

From [1]:

> We demonstrate that our RAG approach trains the model to perform better RAG on the set of documents it is trained on i.e., in-domain. By removing the oracle documents in some instances of the training data, we are compelling the model to memorize domain-knowledge.

What if you wanted to train it to say that it didn't find the answer?

[1] https://gorilla.cs.berkeley.edu/blogs/9_raft.html
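The "removing the oracle documents in some instances" step quoted above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the names `RaftExample`, `build_raft_example`, `p_oracle`, and `num_distractors`, and the default values, are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class RaftExample:
    question: str
    context: list[str]   # documents shown to the model during fine-tuning
    answer: str          # target answer written from the oracle document
    has_oracle: bool     # whether the oracle document is in the context


def build_raft_example(
    question: str,
    oracle_doc: str,
    distractor_pool: list[str],
    answer: str,
    *,
    p_oracle: float = 0.8,      # assumed fraction of examples keeping the oracle
    num_distractors: int = 3,   # assumed number of distractors per example
    rng: Optional[random.Random] = None,
) -> RaftExample:
    """Build one training example, dropping the oracle document with
    probability 1 - p_oracle so the context holds only distractors.

    The pool must contain at least num_distractors + 1 documents.
    """
    if rng is None:
        rng = random.Random()
    keep_oracle = rng.random() < p_oracle
    # Keep the context size constant whether or not the oracle is present.
    n = num_distractors if keep_oracle else num_distractors + 1
    context = rng.sample(distractor_pool, n)
    if keep_oracle:
        context.append(oracle_doc)
    rng.shuffle(context)  # avoid leaking the oracle's position
    return RaftExample(question, context, answer, keep_oracle)


# Toy usage with made-up documents.
pool = [f"distractor {i}" for i in range(10)]
example = build_raft_example(
    "What does RAFT stand for?",
    "oracle: RAFT is a fine-tuning recipe for domain-specific RAG.",
    pool,
    "A fine-tuning recipe for domain-specific RAG.",
    rng=random.Random(0),
)
print(example.has_oracle, len(example.context))
```

In this sketch the target answer is unchanged when the oracle is dropped, which matches the quoted aim of pushing the model to memorize domain knowledge. Training it to say it didn't find the answer would instead mean swapping in a refusal as the target for those examples; that variation is an assumption here, not something the blog post describes.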

catchnear4321 · 2y ago

> They hypothesized that a student who studies the textbooks before the open-book exam was likely to perform better than a student who studies the textbook.

interesting hypothesis, but probably not what was meant.

“editor” is shaping up to be the hot new job for fleshy consciousnesses by the early thirties.

Charlie-Ji · 2y ago

RAFT is amazing!!
