LLM.int8(): 8-Bit Matrix Multiplication for Transformers at Scale
(arxiv.org)
7 points
ofirpress
3y ago
1 comment
ofirpress
OP
3y ago
Cool new efficient inference method that saves 2x memory and does not degrade performance for large language models!
More from the author about this at:
https://twitter.com/Tim_Dettmers/status/1559892888326049792
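To make the idea concrete: the memory saving comes from doing matrix multiplies with 8-bit integer operands instead of 16-bit floats. Below is a minimal sketch of absmax int8 quantized matmul in NumPy. This is an illustration only, not the paper's actual LLM.int8() scheme (which additionally uses vector-wise scaling constants and splits outlier feature dimensions into a separate fp16 multiply); all function names here are made up for the example.

```python
import numpy as np

def absmax_quantize(x):
    # Per-row absmax scaling: map each row of x into the int8 range [-127, 127].
    scale = 127.0 / np.max(np.abs(x), axis=1, keepdims=True)
    return np.round(x * scale).astype(np.int8), scale

def int8_matmul(a, b):
    # Quantize both operands, accumulate in int32 to avoid overflow,
    # then dequantize with the product of the two scale factors.
    qa, sa = absmax_quantize(a)
    qb, sb = absmax_quantize(b.T)  # quantize b column-wise via its transpose
    acc = qa.astype(np.int32) @ qb.astype(np.int32).T
    return acc / (sa * sb.T)

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 8))
b = rng.normal(size=(8, 3))
approx = int8_matmul(a, b)
exact = a @ b
print(np.max(np.abs(approx - exact)))  # small quantization error
```

Storing the int8 operands takes half the bytes of fp16 (and a quarter of fp32), which is where the 2x memory saving comes from; the paper's contribution is keeping this accuracy-neutral at multi-billion-parameter scale despite outlier features.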