0 points · olliej · 4y ago · 0 comments

The reason they run on a GPU isn’t spite. It’s because neural-net-based ML inherently depends on vast numbers of independent floating-point operations.
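To make "independent" concrete, here is a minimal sketch of a single dense layer in plain Python. The weights and inputs are made-up illustrative values, and `neuron_output` and `dense_layer` are hypothetical helper names, not anything from the paper.

```python
# Illustrative only: a tiny dense layer (y = W x) with made-up numbers.

def neuron_output(weights_row, inputs):
    """Dot product for one output neuron: a chain of multiply-adds."""
    return sum(w * x for w, x in zip(weights_row, inputs))

def dense_layer(weights, inputs):
    # No output reads any other output, so every row could be handed
    # to a different FPU and computed at the same time.
    return [neuron_output(row, inputs) for row in weights]

weights = [
    [0.5, -1.0, 2.0],
    [1.5, 0.0, -0.5],
    [0.25, 0.25, 0.25],
]
inputs = [2.0, 4.0, 8.0]

outputs = dense_layer(weights, inputs)

# Computing the rows in a different order gives the same result,
# which is what makes the work trivially parallel.
shuffled = [neuron_output(row, inputs) for row in reversed(weights)][::-1]
print(outputs, outputs == shuffled)
```

A real network has millions of such rows per layer, which is where the sheer number of FPUs starts to matter.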

CPUs tend to have very few FPUs per core, so a modern system’s CPUs max out, in idealised throughput, at maybe 40-80 concurrent streams. On top of that, the FPUs on a CPU are generally required to perform fully compliant IEEE 754 arithmetic at no less than 32 bits of precision.

Modern GPUs can have that number of FPUs per hardware thread, and then have a few hundred of those hardware threads. Each of those GPU FPUs is also faster, as it can both elide some elements of IEEE 754 and operate at lower precision (fp16) to get even more performance.
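For a feel of what dropping to fp16 costs in accuracy, here is a small sketch using Python's standard `struct` module, whose `e` and `f` format codes store IEEE 754 half and single precision. The helper names `round_to_fp16` and `round_to_fp32` are made up for this example, and it only shows rounding, not GPU behaviour.

```python
import struct

def round_to_fp16(value):
    """Round a Python float to the nearest half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

def round_to_fp32(value):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

x = 0.1
err16 = abs(round_to_fp16(x) - x)
err32 = abs(round_to_fp32(x) - x)
print(f"fp16 error: {err16:.3e}, fp32 error: {err32:.3e}")

# fp16 carries an 11-bit significand, so integers above 2048 are no
# longer all representable.
print(round_to_fp16(2048.0), round_to_fp16(2049.0))
```

Neural net workloads usually tolerate that loss, which is why trading precision for throughput is a reasonable deal there.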

So you could read the paper and implement it on a CPU, and the very best that you, or anyone, could do would be literal orders of magnitude slower than the GPU implementation.
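As a back-of-envelope check on "orders of magnitude", the rough figures above can be plugged in directly. This is only a sketch: the 200 is an assumed stand-in for "a few hundred" hardware threads, and clock speed, memory bandwidth, and the fp16 gain are all ignored.

```python
import math

# Figures taken from the comment: 40-80 concurrent FP streams on a CPU,
# and roughly that many FPUs per GPU hardware thread.
cpu_streams = 80
gpu_fpus_per_thread = 80

# ASSUMPTION: 200 stands in for "a few hundred" hardware threads.
gpu_hw_threads = 200

def parallel_lanes(fpus_per_thread, hw_threads):
    """Total FPUs that can be busy at once."""
    return fpus_per_thread * hw_threads

def idealised_ratio(cpu_lanes, gpu_lanes):
    """How many times more FP work the GPU can have in flight."""
    return gpu_lanes / cpu_lanes

gpu_lanes = parallel_lanes(gpu_fpus_per_thread, gpu_hw_threads)
ratio = idealised_ratio(cpu_streams, gpu_lanes)
orders = math.log10(ratio)
print(f"{gpu_lanes} GPU lanes vs {cpu_streams} CPU streams: "
      f"{ratio:.0f}x, about {orders:.1f} orders of magnitude")
```

Under these assumptions the gap is already a couple of orders of magnitude before lower precision is counted.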

That’s why you don’t see them doing it on a CPU, let alone in Python.

2 comments · 1 top-level

adammarples · 4y ago · 1 in thread

The reason it runs on a GPU is because this research was literally done by NVIDIA!

olliej (OP) · 4y ago

Nvidia also makes CPUs.

The reason the research is coming out of Nvidia is because this kind of research is inherently GPU limited. So if it came out of AMD, Intel, Google, or Apple, it would be dependent on either a GPU or non-programmable NN-specific hardware. If it came out of academia it would still be on a GPU, because none of this is remotely practical on a CPU.