Better HN
0 points
WarmWash
2mo ago
GLM 5.1, widely held up as the model at the heels of Western models, perhaps even surpassing them...
Gets 5% on ARC-AGI2 private set.
Chinese models are suspiciously good at benchmarks.
1 comment · 1 top-level
ctolsen
2mo ago
I mean, I could say the same about Gemini. 3.1 Pro tops a bunch of benchmarks out there, but in any practical use I've put it to, it underperforms both other proprietary models and open-weight models. Benchmarks are suspicious in general.