tatef
3mo ago
Hypura reads tensor weights from the GGUF file on NVMe into RAM/GPU memory pools, then compute happens entirely in RAM/GPU.
There is no writing to the SSD during inference with this architecture.
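To illustrate the read-only access pattern described above, here is a minimal sketch in Python. It is not Hypura's code: the `load_into_pool` helper, the stand-in file, and the offsets are all hypothetical, and a plain `bytearray` stands in for a RAM/GPU memory pool. It only shows that bytes can be copied out of a read-only mapping and then modified in memory without the file on disk changing.

```python
import mmap
import os
import tempfile


def load_into_pool(path, offset, length):
    """Copy `length` bytes at `offset` from a read-only mapping into a RAM buffer.

    Hypothetical helper: a bytearray stands in for a RAM/GPU memory pool.
    """
    with open(path, "rb") as f:  # file is opened read-only
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return bytearray(mapped[offset:offset + length])


# Stand-in for a weights file (assumption: 16 dummy bytes, not real GGUF data).
fd, path = tempfile.mkstemp(suffix=".bin")
with os.fdopen(fd, "wb") as f:
    f.write(bytes(range(16)))

pool = load_into_pool(path, offset=4, length=8)
pool[0] = 255  # "compute" mutates the in-memory copy only

with open(path, "rb") as f:
    on_disk = f.read()  # file contents are unchanged

os.remove(path)
```

After the copy, everything that touches `pool` happens in memory, so the only storage traffic in this sketch is the initial read.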
embedding-shape
3mo ago
Even if there was a ton of writing, I'm not sure where NVMe even comes into the picture. Write durability is about the flash cells in the SSD and has nothing to do with the interface. Someone correct me if I'm wrong.