0 points
joelthelion
2y ago
I wonder how that weird HellaSwag lag is possible. Is there something really special about that benchmark?
HereBePandas
2y ago
The tech report seems to hint that GPT-4 may have had some train/test data contamination, so its performance may be overstated.
erikaww
2y ago
Yeah, a lot of local models fall short on that benchmark as well. I wonder what was different about GPT-3.5/4's training/data that would lead to its great HellaSwag perf.