Der_Einzige · 3y ago · 0 points

In general, most giant LLMs are extremely undertrained at this time. Consider that most of the gains of RoBERTa over BERT came from simply continuing to train.

2 comments · 2 top-level

stevenhuang · 3y ago

Undertraining can be observed whenever the output degenerates into repeated gibberish or loops. This happened a lot in the GPT-2 AI Dungeon days.

leobg · 3y ago

So can we continue training RoBERTa to get it to, say, GPT-3 Ada level?
