haldujai 3y ago

While true, I think this also misses that “for almost everyone else” you’re probably not (or at least should not be) trying to optimize zero-shot performance if you have an intended high-inference use case, so I don’t think Chinchilla would be all that relevant.

vintermann 3y ago

I suspect that good zero-shot performance is a good starting point for fine-tuning. If you have more than one intended high-inference use case, or can imagine a couple of new ones on the horizon, it might still be best not to target the first use case directly.

haldujai (OP) 3y ago

Well yeah, that’s kind of intuitive. My point is that if you just optimize for zero-shot, you end up with something like GPT-4 when an enterprise could probably be using a fine-tuned LLaMA-7B with similar performance.
