There are several features that make Arvados unique.
* The content-addressed storage system hashes all data, so you can unambiguously reference an immutable data set. This is similar to git, but capable of handling huge individual files (hundreds of gigabytes) and scaling to petabytes.
* Every compute job is recorded in a database with hashes identifying the inputs, Docker image, and outputs, so re-running past jobs is easy.
* It is designed to federate multiple instances, both to support "hybrid cloud" setups within an organization and to allow different organizations to share data.
These features are particularly important to the bioinformatics community, but they solve problems that are common to many big-data informatics fields.
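
To make the content-addressing idea concrete, here is a minimal toy sketch in Python. It is not Arvados code and does not use the Arvados API: `ToyContentStore` and `job_key` are hypothetical names, and the choice of SHA-256 and of an in-memory dict is an assumption made only for illustration. It shows why hashing the data gives an unambiguous, immutable reference, and how a job described by the hashes of its inputs and Docker image can be recognized when the same job comes up again.

```python
import hashlib
import json


class ToyContentStore:
    """In-memory content-addressed store: the key is the hash of the bytes."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content always maps to the same key, so writes are idempotent.
        self._blocks[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blocks[digest]
        # Re-hash on read: the address doubles as an integrity check.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored block does not match its address")
        return data


def job_key(input_hashes, image_hash, command):
    """Derive a deterministic identifier from what defines a job.

    Assumption for this sketch: inputs are treated as an unordered set,
    so they are sorted before hashing.
    """
    record = {
        "inputs": sorted(input_hashes),
        "image": image_hash,
        "command": command,
    }
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


if __name__ == "__main__":
    store = ToyContentStore()
    ref = store.put(b"ACGTACGT")
    print("data reference:", ref)
    print("job identifier:", job_key([ref], "image-hash-placeholder", ["align"]))
```

Because the reference is derived from the bytes themselves, two people holding the same reference are guaranteed to be talking about the same data, and a job whose inputs, image, and command all hash to the same values can be looked up rather than guessed at.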