LLM in a Flash: Efficient Large Language Model Inference with Limited Memory

(arxiv.org)

12 points · keep_reading · 2y ago · 1 comment

dang · 2y ago

LLM in a Flash: Efficient LLM Inference with Limited Memory - https://news.ycombinator.com/item?id=38704982 - Dec 2023 (52 comments)
