Skip to content
Better HN
Top
Best
Ask
Show
New
Jobs
Search
⌘K
0 points
androiddrew
3mo ago
0 comments
Save
Share
Could you share what you are using for inference and how you are running it? I have a 64G VRAM/128G system RAM setup.
0 comments
1 comments · 1 top-level
top
newest
oldest
sosodev
3mo ago
Most people are using something in the llama family for inference. Llama server is my go to. Unsloth guides describe how to configure inference for your model of choice.
j
/
k
navigate · click thread line to collapse