Show HN: Realtime LLM Chat on an 8GB Nvidia GPU
(github.com)
1 point
z991
2y ago
0 comments
Demo runs on a laptop 3070 Ti with 8GB of VRAM. GPU memory usage doesn't go above 6GB, so it might run on an even smaller GPU. Uses a 4-bit quantized 7B-parameter alpaca_lora model, and performance is significantly worse than ChatGPT, as you'd expect.
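The under-6GB figure is consistent with a back-of-the-envelope estimate: at 4 bits per weight, a 7B-parameter model's weights alone take roughly 3.3 GiB, leaving headroom for the KV cache and activations. The sketch below is an illustrative calculation under those assumptions, not a measurement from the demo.

```python
# Rough VRAM estimate for a 4-bit quantized 7B-parameter model.
# Numbers are illustrative assumptions, not measured from the demo.

def quantized_weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GiB for a quantized model."""
    return n_params * bits_per_weight / 8 / 2**30

weights = quantized_weight_gib(7e9, 4)
print(f"4-bit weights: {weights:.2f} GiB")  # about 3.26 GiB
```

Actual usage is higher than the weights alone because of the KV cache (which grows with context length), activations, and framework overhead, which is why the observed peak sits closer to 6GB than 3.3GB.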