0 points · choppaface · 5y ago · 0 comments

I feel like push-down operations might be a better analogy from the MapReduce world?

It strikes me that these processors would be most helpful in pre-multiplies, filter operations, and perhaps scatters. All of that is relevant not just to TensorFlow / PyTorch workloads but also to databases. While I’m sure the “AI” labeling is pure marketing, I’d imagine Samsung would love to target workloads beyond deep learning training and inference.

3 comments · 1 top-level

BenoitP · 5y ago · 2 in thread

> All that stuff is not just relevant to tensorflow / pytorch stuff but also databases.

Yes! And that's the beauty of it. It is not an accelerator; these are fully generic cores.

Not equivalent to 'smart' Intel cores with all the branch prediction, prefetching and caching magic; but with massive computation capabilities nonetheless.

GPUs do have massive amounts of memory (both in RAM and registers), but you have to have preloaded your stuff into it beforehand. And what you can actually do efficiently are SIMD operations.

I'd liken PIM to a better GPU-CPU blend: you get to keep your CPU doing its thing while massively parallel operations run concurrently. Also, these seem to be mostly independent cores, so you would not be limited to SIMD.

Let's bet: in 10 years, AWS will have a new offering: the 'nano lambda'. You get a PIM core share, with 10 MB local 'persistent' RAM (keeping your data + a continuation of your code when it is not running), running your tiny Loom thread [1], at the edge, billed at 1us granularity, only when it is running, and for 0.0000000000000001 USD per us.

[1] https://cr.openjdk.java.net/~rpressler/loom/Loom-Proposal.ht...

choppaface (OP) · 5y ago

Huh, I don't know about 'nano lambda', but I could see it being an add-on to S3. Customers already have petabytes upon petabytes there, and if you could push down a filter or pre-multiply op and save a million bucks, then sure, people would do that. If EMR does it automagically, then even better.

When could it happen? My guess is S3 favors very old compute hardware but I could be wrong. If it does, though, that could also mean it's just very cheap to replace ;)

BenoitP · 5y ago

For predicate pushdown in S3, I'd see that operation in the SSD controller. You wouldn't even have to load the data into RAM before filtering it.
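
To make the pushdown idea concrete, here is a rough sketch in Python. Everything in it is hypothetical: `ToyStorage`, `read_all`, `read_where`, and the row schema are made up for illustration and do not correspond to any real S3, SSD controller, or PIM interface. It only counts how many rows would cross from "storage" to the host in each case.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Made-up schema for illustration: (value, label).
Row = Tuple[int, str]


@dataclass
class ToyStorage:
    """Hypothetical stand-in for a device that can run a filter next to the data."""

    rows: List[Row]
    rows_transferred: int = 0  # rows that crossed from storage to the host

    def read_all(self) -> List[Row]:
        # No pushdown: every row is shipped to the host, which filters afterwards.
        self.rows_transferred += len(self.rows)
        return list(self.rows)

    def read_where(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Pushdown: the predicate runs "inside" the device, so only matches are shipped.
        matches = [row for row in self.rows if predicate(row)]
        self.rows_transferred += len(matches)
        return matches


def is_large(row: Row) -> bool:
    """Example predicate: keep rows whose value is at least 90."""
    return row[0] >= 90


rows = [(i, f"item-{i}") for i in range(100)]

# Host-side filtering: load everything, then filter in host RAM.
host_side = ToyStorage(rows)
host_result = [row for row in host_side.read_all() if is_large(row)]

# Device-side filtering: only the matching rows ever leave storage.
device_side = ToyStorage(rows)
device_result = device_side.read_where(is_large)

print(host_side.rows_transferred, device_side.rows_transferred)
```

Both paths return the same rows; the only difference is how much data has to move before the filter is applied, which is the saving being discussed above.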
