Unfortunately, the dominant LLM architecture makes this largely infeasible right now.
- Gaming hardware has far too little VRAM to train anything close to a state-of-the-art model, and Nvidia is being annoyingly smart about keeping it that way so it can sell enterprise GPUs at exorbitant markups.
- Right now, communication between machines seems to be the bottleneck, and limited VRAM makes it worse: the model has to be spread across more machines, so even more traffic goes over the slow links (rough numbers in the sketch below). Even with data-centre-grade interconnect (mostly InfiniBand, which is also Nvidia, smart-asses), failed links tend to cause big delays in training.
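To make the two points above concrete, here is a rough back-of-envelope sketch, not a benchmark: the 7B model size, the 64 data-parallel workers, and the bandwidth figures are assumptions picked purely for illustration. It estimates the memory needed just to hold weights, gradients and Adam optimizer states, and how long a naive ring all-reduce gradient sync would take per step over different links.

```python
# Back-of-envelope sketch; all model sizes, worker counts and bandwidths
# below are illustrative assumptions, not measurements.

def training_memory_gb(param_count, bytes_per_param=2, optimizer_bytes_per_param=12):
    """Rough memory to train with Adam: fp16 weights + fp16 gradients +
    optimizer states (fp32 master weights + two moments), ignoring activations."""
    return param_count * (2 * bytes_per_param + optimizer_bytes_per_param) / 1e9

def allreduce_seconds(param_count, bytes_per_param, bandwidth_gbps, num_workers):
    """Ring all-reduce: each worker moves roughly 2*(N-1)/N of the gradient bytes."""
    grad_bytes = param_count * bytes_per_param
    traffic = 2 * (num_workers - 1) / num_workers * grad_bytes
    return traffic / (bandwidth_gbps * 1e9 / 8)  # Gbit/s -> bytes/s

params = 7e9  # hypothetical 7B-parameter model

print(f"Weights/grads/Adam states alone: ~{training_memory_gb(params):.0f} GB "
      f"(vs. 24 GB on a typical gaming card)")

for label, gbps in [("home Ethernet, 1 Gbit/s", 1),
                    ("fast LAN, 10 Gbit/s", 10),
                    ("InfiniBand NDR, 400 Gbit/s", 400)]:
    t = allreduce_seconds(params, bytes_per_param=2, bandwidth_gbps=gbps, num_workers=64)
    print(f"Gradient sync over {label}: ~{t:.0f} s per optimizer step")
```

Even if the arithmetic is crude, the gap between the last two lines is the point: over home-grade links each synchronization step costs minutes, not fractions of a second, before any actual compute happens.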
Nevertheless, it is a good direction to push towards, and the government could indeed help, but it will take time. We need both a healthier competitive landscape in hardware and research into model architectures that are easy to train in a distributed manner (easy parallelization was also key to the success of Transformers, but we need to go further).