Skip to content
Better HN
Top
New
Best
Ask
Show
Jobs
Search
⌘K
undefined | Better HN
0 points
csomar
1y ago
0 comments
Share
Models are predictable at 0 temperatures. They might have tested the output beforehand.
0 comments
default
newest
oldest
fzzzy
1y ago
Models in practice haven't been deterministic at 0 temperature, although nobody knows exactly why. Either hardware or software bugs.
Jensson
1y ago
We know exactly why, it is because floating point operations aren't associative but the GPU scheduler assumes they are, and the scheduler isn't deterministic. Running the model strictly hurts performance so they don't do that.
fzzzy
1y ago
Cool, thanks a lot for the explanation. Makes sense.
j
/
k
navigate · click thread line to collapse