Better HN
imiric
4mo ago
Or, 2b: the nerf is real, but benchmarks are gamed and models are trained to excel at them, yet fall flat in real-world situations.
metalliqaz
4mo ago
I mostly stay out of the LLM space but I thought it was an open secret already that the benchmarks are absolutely gamed.