Understanding Emergent Abilities of Language Models from the Loss Perspective
(arxiv.org)
6 points
maccaw
1y ago
1 comment
cosmojg
1y ago
Does this mean that "overtraining" a midsize LLM for many more epochs on a small, representative subset of the dataset used by a larger, more performant LLM might be sufficient for matching the performance of the larger model?