Developing an LLM: Building, Training, Finetuning (A 1h Video Explainer)

Is anyone training LLMs outside of Meta, OpenAI, etc.?

I don't really get the point. For huge models, it's impossible to outcompete them. For smaller models, isn't Mistral or LLaMA good enough?

What are other startups finetuning LLMs for?

pcloadletter_ · 2y ago

I find it can be nice to have an academic understanding of things you work with even if you don't have to develop it directly yourself.

rasbt (OP) · 2y ago

Agreed, understanding how a method works and how it would be done helps with developing an intuition for its limitations -- what it can and what it can't do.

mdp2021 · 2y ago

You are probably forgetting that LLMs are not a final "end-of-history" thing, but a stage that calls for improvement, completion etc.

rasbt (OP) · 2y ago

I wouldn't pretrain from scratch, but continued pretraining is pretty popular for adapting LLMs to recent and/or custom data. (Sometimes this is referred to as 'finetuning'; however, it is not to be confused with 'instruction finetuning'.)
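
To make that distinction a bit more concrete, here is a minimal sketch of how training targets might be built in each case. It is an illustration under stated assumptions, not something taken from the video: the function names and toy token IDs are hypothetical, and masking the prompt tokens during instruction finetuning is a common but optional choice. The value -100 matches the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def pretraining_example(token_ids):
    """Continued pretraining: predict every next token of raw text."""
    inputs = token_ids[:-1]
    targets = token_ids[1:]  # targets are the inputs shifted by one position
    return inputs, targets


def instruction_example(prompt_ids, response_ids, mask_prompt=True):
    """Instruction finetuning: same next-token objective on a
    prompt + response pair, optionally ignoring the prompt in the loss."""
    token_ids = prompt_ids + response_ids
    inputs = token_ids[:-1]
    targets = token_ids[1:]
    if mask_prompt:
        # Targets that are still prompt tokens are excluded from the loss;
        # the first response token is predicted from the last prompt token.
        n_masked = max(len(prompt_ids) - 1, 0)
        targets = [IGNORE_INDEX] * n_masked + targets[n_masked:]
    return inputs, targets


# Toy token IDs (hypothetical): raw text vs. a prompt/response pair
raw_inputs, raw_targets = pretraining_example([1, 2, 3, 4])
sft_inputs, sft_targets = instruction_example([1, 2, 3], [4, 5])
```

In both cases the objective is next-token prediction; what changes is the data (raw text versus formatted instruction-response pairs) and, in this sketch, which positions contribute to the loss.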

htrp · 2y ago

Not Sebastian (who I assume is the OP), but his blog/substack is also a great resource

https://magazine.sebastianraschka.com/

rasbt (OP) · 2y ago

Thanks for mentioning it, that makes me super happy to hear!

mdp2021 · 2y ago

Seems very good, thank you.

The channel: https://www.youtube.com/@SebastianRaschka/videos

contains hundreds of video lessons, apparently originating from Sebastian Raschka's teaching at the University of Wisconsin-Madison (before he went full-time as an entrepreneur).

rasbt (OP) · 2y ago

Thanks, glad that this is helpful!

oneshtein · 2y ago

Can someone train an AI to perform all that?
