edit: Gbps not GBps
If you want to saturate network bandwidth with S3, that's the one tool I know of that can do it.
I either missed it or OP didn't specify how many instances they were using at once to run their benchmark, but the more instances they used, the worse the per-node throughput will be.
This did not seem to be accounted for.
EDIT: OP says below it was from one instance, so what I said doesn't apply to this writeup.
Google Cloud Storage does not limit read or write throughput with the exception of our "Nearline" product (and even Nearline's limiting can be suspended for additional cost, a feature called "On-Demand I/O").
(Note that I have done some testing from AWS Lambda, where we had 1k lambda jobs all pulling down files from S3 at once. That's a bit harder to benchmark...)
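That kind of fan-out benchmark is hard to reproduce exactly, but the pattern is easy to sketch. Below is a minimal Python version using a thread pool in place of the 1k Lambda workers, with a stubbed `fetch` standing in for the S3 GET (a real worker would call `s3.get_object(Bucket=..., Key=key)` via boto3); the key names and worker count are made up for illustration.

```python
import concurrent.futures
import time

def fetch(key):
    # Stand-in for an S3 GET; sleeps to simulate network latency.
    time.sleep(0.01)
    return len(key)

# Hypothetical object keys; a real run would list a bucket prefix.
keys = [f"data/part-{i:05d}" for i in range(100)]

start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
    sizes = list(pool.map(fetch, keys))
elapsed = time.perf_counter() - start

print(f"fetched {len(sizes)} objects in {elapsed:.2f}s")
```

With real network calls, aggregate throughput is what you'd measure; the hard part, as noted, is that the bottleneck moves between the workers, the network, and the storage service.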
It sounds like that wouldn't have been a factor, except for the cap you seem to have discovered on Amazon that you called out.
My only suggestion then is you may want to make it explicit that you ran the benchmarks from a single instance.
They explicitly mention the RPS per account limit in that doc, which is related.
Is this a limit that is hit anywhere near the 150GB discussed in this article, or is it something that you hit only if you are Netflix? We have TB in S3 and have not observed any limit other than EC2 instance bandwidth.
I'm not sure why anyone would build a server with less than 96GB on it these days, so it's not at all unreasonable. Now your service provider may jerk you around, but you can run two racks of machines (48 machines) in a data center with specs like that for about $25K/month (including dual gigabit network pipes to your favorite IP transit provider), so it isn't even all that huge of an investment.
[1] Consider your typical 'memcached' type service where data is named as a function of IP and offset.
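For concreteness, here is one way such a naming scheme could look in Python. The block size and key format are my own illustrative choices, not anything specified in the comment; the point is just that the key is computed deterministically from the IP and a block-aligned offset, so any node can reconstruct it without a lookup.

```python
def cache_key(ip: str, offset: int, block_size: int = 65536) -> str:
    # Align the offset down to a block boundary so all reads within
    # one block map to the same cache entry.
    aligned = offset - (offset % block_size)
    return f"{ip}:{aligned}"

print(cache_key("10.0.0.5", 70000))  # -> 10.0.0.5:65536
```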
The colors used for S3 and Azure Storage in the graphs are nearly indistinguishable to me, as I have moderate red-green colorblindness. It's easier to tell them apart on the bar graphs, since the patches of color are much larger, although I still have to work at it and use the labels as hints; on the line graphs, it's basically impossible. A darker shade of green would solve the problem for me personally, but I'm not all that bad a case, nor an expert on the best shades to pick for general color-blindness accessibility.
Just something to think about when presenting data like this.
The bigger point in the article is that these exact "take processing to the data" architectures operate exceedingly well on S3, GCS, Azure.
And, as a biased observer, these architectures operate on GCS the best due to great performance measured in the article, quick VM standup times, low VM prices, and per-minute billing.
EMR-on-S3 is the "copy the data to the processing nodes" variety.
Spinning up a cluster of VMs, using it for 10 seconds, and being billed a one-hour minimum seems expensive to me.
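The gap between per-minute and hourly-minimum billing is easy to quantify. A quick sketch, with a hypothetical $0.50/hour rate and 100 VMs (numbers chosen for illustration, not taken from any provider's price list):

```python
import math

def cost_min_hourly(seconds, rate_per_hour, vms):
    # Each VM is billed at least one full hour, rounded up to whole hours.
    return vms * rate_per_hour * max(1, math.ceil(seconds / 3600))

def cost_per_minute(seconds, rate_per_hour, vms):
    # Each VM is billed per minute, rounded up to whole minutes.
    return vms * rate_per_hour * math.ceil(seconds / 60) / 60

# 100 VMs at a hypothetical $0.50/hour, used for 10 seconds:
print(cost_min_hourly(10, 0.50, 100))  # 50.0 -- a full hour each
print(cost_per_minute(10, 0.50, 100))  # roughly 0.83 -- one minute each
```

For short-lived clusters the difference is roughly 60x, which is the point being made about per-minute billing above.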
These tests were done over a year ago so bandwidth limitations on EC2 may have changed since.
This testing was with https://github.com/rlmcpherson/s3gof3r
It's a private connection to AWS services, including S3. You'd use the same URL, as it's basically a routing change. No idea if VPC endpoints would be better than an internet gateway, though. P.S. Just tested, and I get about half the latency on a VPC endpoint.
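A latency comparison like the one described is mostly about timing repeated requests and taking the median. Here's a minimal, generic timing helper sketched in Python; the `request` callable is a stand-in (a real test would issue a HEAD for the same S3 object once routed via the internet gateway and once via the VPC endpoint), and the stub below just sleeps a fixed 5ms so the example is self-contained.

```python
import statistics
import time

def median_latency_ms(request, trials=20):
    # Time repeated calls to `request` and return the median in milliseconds.
    # Median is used rather than mean to dampen one-off network spikes.
    samples = []
    for _ in range(trials):
        t0 = time.perf_counter()
        request()
        samples.append((time.perf_counter() - t0) * 1000)
    return statistics.median(samples)

# Stub request: sleeps ~5ms in place of a real S3 HEAD over either route.
result = median_latency_ms(lambda: time.sleep(0.005))
print(f"median latency: {result:.1f} ms")
```

Running the same helper against both routes and comparing the two medians would give a number like the "about half the latency" observation above.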
EFS supports NFSv4, so it should avoid being as routinely limited by server round-trip latency as NFSv3 tends to be, but it'd be nice to see how well that works in practice.