
0 points · solidasparagus · 2y ago · 0 comments

How much of that is just the flood of traditional engineers into the space and the fact that collecting data and then fine-tuning models is orders of magnitude more complex than just throwing in RAG? I suspect a huge amount of RAG's popularity is just that any engineer can do a version of it + ChatGPT API calls in a day.

As for LoRA - in the context of my comment, that's just splitting hairs IMO. It falls in the category of fine-tuning for me, although I understand why you might disagree. But it's not like the article mentions LoRA either, nor am I aware of people doing LoRA without GPUs, which the article is against ("No GPUs before PMF").


4 comments · 1 top-level

altdataseller · 2y ago · 3 in thread

I disagree. No amount of fine tuning will ever give the LLM the relevant context with which to answer my question. Maybe if your context is a static Wikipedia or something that will never change, you can fine tune it. But if your data and docs keep changing, how is fine tuning going to be better than RAG?

idf00 · 2y ago

Luckily it's not one or the other. You can fine tune and use RAG.

Sometimes RAG is enough. Sometimes fine tuning on top of RAG is better. It depends on the use case. I can't think of any examples where you would want to fine tune and not use RAG as well.

Sometimes you fine tune a small model so it performs close to a larger variant on that specific narrow task, and you improve inference performance by using the smaller model.

solidasparagus (OP) · 2y ago

Continuous retraining and deployment maybe? But I'm actually not anti-RAG (although I think it is overrated because the retrieval problem is still handled extremely naively), I just think that fine-tuning should also be in your toolkit.

altdataseller · 2y ago

Why is the retrieval part overrated? There isn't even a single way to retrieve. It could be a simple keyword search, a vector search, a combo, or just simply retrieving a single doc and stuffing it in the context.
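To make those options concrete, here is a minimal sketch of keyword, vector, and combined retrieval feeding a prompt. Everything in it is an assumption for illustration: the toy `DOCS` corpus, the function names, and the `alpha` blend weight are all made up, and plain term-count cosine similarity stands in for a real embedding model.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real system would index actual documents.
DOCS = {
    "billing": "how to update billing details and download invoices",
    "api": "api keys rate limits and authentication tokens",
    "export": "export your data as csv or json files",
}


def tokenize(text):
    """Lowercase whitespace tokenizer (deliberately naive)."""
    return text.lower().split()


def keyword_score(query, doc):
    """Fraction of distinct query terms that appear in the document."""
    q_terms, d_terms = set(tokenize(query)), set(tokenize(doc))
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def vector_score(query, doc):
    """Cosine similarity of term-count vectors (stand-in for embeddings)."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs=DOCS, mode="hybrid", alpha=0.5, k=1):
    """Return the names of the top-k documents under the chosen mode."""
    def score(doc):
        if mode == "keyword":
            return keyword_score(query, doc)
        if mode == "vector":
            return vector_score(query, doc)
        if mode == "hybrid":
            # Weighted blend of the two scores; alpha is an arbitrary choice.
            return alpha * keyword_score(query, doc) + (1 - alpha) * vector_score(query, doc)
        raise ValueError(f"unknown mode: {mode}")

    ranked = sorted(docs, key=lambda name: score(docs[name]), reverse=True)
    return ranked[:k]


def build_prompt(query, docs=DOCS, **kwargs):
    """Stuff the retrieved documents into the context ahead of the question."""
    context = "\n".join(docs[name] for name in retrieve(query, docs, **kwargs))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Calling `build_prompt("download invoices", mode="keyword")` puts the billing document in the context. The "single doc" case from the comment is just `k=1`; swapping `vector_score` for real embeddings would change the scoring but not the overall shape.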

