My benchmark for large language models
(nicholas.carlini.com)
4 points
cheviethai123
2y ago
2 comments
cheviethai123
OP
2y ago
Notice how low Gemini scores here compared with other LLM tests. I'm also impressed that the evaluation method can assess performance without relying on tailored prompts.
hoamatcuoi
2y ago
But the benchmark only scores Gemini-Pro 1. I'm curious how Gemini Ultra would perform here, but I guess we can't know yet.