Recently I was planning a fairly big download of a dataset (45 TB), and when I first looked at it, I figured I could shard the file list and run a bunch of parallel loaders on our cluster.
Instead, I made a VM with 120 TB of storage (on AWS, using FSx) and ran a single instance of git clone for several days (unattended; I just periodically checked in to make sure git was still running). The storage was more than 2X the dataset size because git LFS needs roughly twice the dataset's size on disk (the LFS object store plus the checked-out files). A single multithreaded git process was able to download at 350 MB/s and finished at the predicted time (about 3 days). Then I used 'aws s3 sync' to copy the data back to S3, writing at over 1 GB/s. When I copied the data between two buckets, the rate was 3 GB/s.
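For anyone wanting to reproduce the sequence, here is a minimal sketch of the steps as a Python script that shells out to git and the AWS CLI. The repository URL, local path, and bucket name are placeholders, not the ones I actually used, and you'd want the LFS concurrency and instance sizing tuned to your own setup.

```python
import subprocess

# Hypothetical names; substitute your own dataset, FSx mount, and bucket.
REPO_URL = "https://huggingface.co/datasets/example/large-dataset"
LOCAL_DIR = "/fsx/large-dataset"
S3_DEST = "s3://example-bucket/large-dataset/"

def run(cmd):
    """Run a command, echoing it first and failing loudly on error."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Ensure the LFS smudge filter is active so `git clone` pulls the large
# objects as part of the checkout.
run(["git", "lfs", "install"])

# One long-running clone; git LFS fetches objects over several concurrent
# transfers (tunable via the lfs.concurrenttransfers config key).
run(["git", "clone", REPO_URL, LOCAL_DIR])

# Copy the checked-out data up to S3; `aws s3 sync` parallelizes uploads.
run(["aws", "s3", "sync", LOCAL_DIR, S3_DEST])
```

In practice you'd run something like this under nohup or tmux so it survives a dropped SSH session, which is what made the "check in periodically" workflow possible.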
That said, there are things we simply can't do without distributed computing, because there are hard limits on how many CPUs and how much local storage can be attached to a single memory address space.