I'm still working through the docs and the Manta source code to see what it actually does, but it does seem unique if the data storage nodes are also the data processing nodes, so that no data has to be transferred from a separate storage service before a job begins. The other key factor is having neither startup time nor the cost of a perpetually running cluster. Per my comment below [1], we have used Lambda with S3 to get something like this, as well as our own architecture built on plain EC2/GCE nodes.
[1] https://news.ycombinator.com/item?id=10846514