undefined | Better HN

0 pointszozbot2345mo ago0 comments

It's SIMD-based at the lowest level, but there's also the use of very high hardware multithreading (the threads are called, AIUI, "wavefronts" or "warps") on each compute unit/stream processor to hide memory access latency. Recent SPARC CPU's have 8-way hardware multithreading on the individual CPU core, GPU's can easily go even higher than that.

0 comments

fulafel5mo ago

Yep, this also reflects the design target of GPUs targeting much larger working sets, so have higher main memory bandwidth and rely less on caches. CPUs rather have few fast threads of execution working on hot cached data than many slow ones talking to main memory (because N-way thread level parallelism often splits your cache N ways, to N working sets)

j / k navigate · click thread line to collapse

0 pointszozbot2345mo ago0 comments

0 comments

fulafel5mo ago

j / k navigate · click thread line to collapse