
fragmede 1y ago

NVLink is what makes multi-GPU work. It lets the GPUs talk to each other across a high-bandwidth (up to 600 GB/s on A100-generation hardware), low-latency link. TensorFlow and PyTorch both support it, among other things. It's not some weird side note; the interconnect between nodes is what makes a supercomputer super. You don't hear about it much because mainstream media doesn't cover many of the details of supercomputing.
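
For a rough sense of why link bandwidth matters, here is a back-of-envelope sketch in Python. The bandwidth constants are nominal peak figures assumed for illustration (A100-generation NVLink at about 300 GB/s per direction, PCIe 4.0 x16 at roughly 32 GB/s per direction), the helper name `transfer_ms` is made up, and the 1 GB payload is hypothetical. Real transfers land below peak, so treat this as an upper bound on the gap, not a benchmark.

```python
# Nominal peak bandwidths in GB/s, per direction (assumed, not measured).
NVLINK3_GB_PER_S = 300.0    # A100-generation NVLink: 600 GB/s total across both directions
PCIE4_X16_GB_PER_S = 32.0   # PCIe 4.0 x16, approximate


def transfer_ms(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Ideal time in milliseconds to move size_gb over a link, ignoring latency."""
    return size_gb / bandwidth_gb_per_s * 1000.0


if __name__ == "__main__":
    size_gb = 1.0  # hypothetical payload, e.g. a chunk of gradients
    for name, bw in [("NVLink", NVLINK3_GB_PER_S), ("PCIe 4.0 x16", PCIE4_X16_GB_PER_S)]:
        print(f"{name}: {transfer_ms(size_gb, bw):.2f} ms")
```

How much of that gap shows up in practice depends on how often the workload actually moves data between GPUs.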


RockRobotRock 1y ago

Thank you, but this doesn't really answer OP's question or mine. Is NVLink required if you want to run an LLM that exceeds the memory of a single GPU? How do benchmarks compare with and without it?

I've heard that NVLink helps with training, but not so much with inference.
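
To make the "exceeds the memory of a single GPU" case concrete, here is a quick sizing sketch in Python. The 70B parameter count, 2-byte (fp16) weights, and 80 GB card are hypothetical numbers picked for illustration, and `weights_gb` and `min_gpus_for_weights` are made-up helper names. It counts weights only, so KV cache, activations, and framework overhead would push the real requirement higher.

```python
import math


def weights_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def min_gpus_for_weights(n_params: float, gpu_mem_gb: float, bytes_per_param: int = 2) -> int:
    """Smallest number of equal-sized GPUs whose combined memory holds the weights."""
    return math.ceil(weights_gb(n_params, bytes_per_param) / gpu_mem_gb)


if __name__ == "__main__":
    n_params = 70e9      # hypothetical 70B-parameter model
    gpu_mem_gb = 80.0    # hypothetical 80 GB card
    print(f"weights: {weights_gb(n_params):.0f} GB")
    print(f"GPUs needed (weights only): {min_gpus_for_weights(n_params, gpu_mem_gb)}")
```

This only shows when a model has to be split across cards; it says nothing about whether the link between them needs to be NVLink.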
