There is a lot of untapped parallelism waiting for the right code to exploit it.
That said, I think the original comment was rightly pointing out how easy it was to make the change and test it, which in this case did turn out to be noticeably faster.
That's the real issue here! Most languages have poor abstractions for parallelism and concurrency. (Many languages don't even differentiate between the two: concurrency is about structuring overlapping tasks, while parallelism is about executing work simultaneously on multiple cores.)
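To make the distinction concrete, here's a minimal Python sketch (the function names and timings are illustrative, not from the thread): asyncio gives you concurrency by overlapping waits on one core, while a multiprocessing pool gives you parallelism across cores.

```python
import asyncio
from multiprocessing import Pool

# Concurrency: overlapping waits, typically on a single core.
async def fetch(i):
    await asyncio.sleep(0.1)  # stand-in for an I/O wait
    return i

async def concurrent_demo():
    # Three 0.1 s waits overlap, so the whole gather takes ~0.1 s, not 0.3 s.
    return await asyncio.gather(*(fetch(i) for i in range(3)))

# Parallelism: CPU-bound work spread over multiple processes/cores.
def square(n):
    return n * n

if __name__ == "__main__":
    print(asyncio.run(concurrent_demo()))
    with Pool(2) as pool:
        print(pool.map(square, range(3)))
```

The point stands either way: neither mechanism shares an abstraction with the other, and picking the wrong one for your workload is one of those footguns a good abstraction would hide.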
Encapsulating and abstracting is how we make things usable.
E.g. letting people roll their own hash tables every time they want one would lead to people shooting themselves in the foot more often than not. Compared to that, Python's dicts are dead simple to use, exactly because they move all the fiddly bits out of the reach of meddling hands.
I think the example case in this subthread was about making some long-running app operations asynchronous and overlapping, which is a more forgiving use case than trying to make a piece of code faster by utilizing multiple cores.
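That kind of change can be sketched in a few lines. This is a hypothetical example (the operation names and sleep durations are made up, standing in for whatever the app was actually doing): two independent slow operations run back to back take the sum of their times, but awaited together they take roughly the longer of the two.

```python
import asyncio
import time

# Hypothetical stand-ins for two independent long-running app operations.
async def load_config():
    await asyncio.sleep(0.2)  # e.g. fetching remote configuration
    return {"retries": 3}

async def warm_cache():
    await asyncio.sleep(0.2)  # e.g. priming a cache over the network
    return "cache ready"

async def startup():
    # Overlap the two waits: total is ~0.2 s instead of ~0.4 s sequentially.
    start = time.monotonic()
    config, cache = await asyncio.gather(load_config(), warm_cache())
    elapsed = time.monotonic() - start
    return config, cache, elapsed

if __name__ == "__main__":
    print(asyncio.run(startup()))
```

This is the forgiving case: the operations are independent I/O waits, so there's no shared state to protect and no need to split CPU work across cores.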