The Amazon Prime Day 2023 AWS Bill (opens in new tab)

(lastweekinaws.com)

147 pointsbpugh2y ago62 comments

62 comments

  Amazon Prime Day event resulted in an incremental 163 petabytes of EBS storage capacity allocated – generating a peak of 15.35 trillion requests and 764 petabytes of data transfer per day.

The main thing that strikes me is how (seemingly) inefficient everything is. What do they possibly need this amount of data for in selling stuff? Are they taking high-def video of every customer as they browse for something to buy? I get that it's a huge company and this is (I guess) their business time, but how can the y need so much storage. Ditto for much of the other stuff.

luhn2y ago

Yeah, those numbers struck me as well. At 375 million items sold, that's about 0.5GB storage and 2GB transfer per item.

steveBK1232y ago

10+ years ago I worked on a trading system that was generating something like 1TB/day of messaging.

As we hit these levels we asked them - how many trades are we even doing on this system? The answer was something on the order of.. 50. Granted it was a bond system and the nationals are huge, but theres just no reason to store 20GB per trade.

These are the kinds of decisions that get made when one team is responsible for message generation and the other is responsible for the storage, lol.

We then had to work backwards with them to unwind a lot of the INFO level chatty messaging between what you'd now call "microservices" and reduce the volume by 90+%.

fbdab1032y ago

I suppose you need to know how many requests did not result in a purchase. Is it 1000 views:purchase? I have not checked in on a Prime Day sale for several years, but is there any timeliness component (Flash Sales?) where people would be incentivized to mash the reload button?

tomwheeler2y ago

Yes, but that's per item sold.

After looking at screen after screen of no-name garbage on Prime Day, I gave up. I suspect that there are tons of people like me. In other words, we only contributed to the numerator, not the denominator.

thenewarrakis2y ago

I think the EBS numbers are "double counting". Most of the other services in the list are using EBS under the hood, so I wouldn't be surprised if this number includes stuff like the Aurora instances, CloudTrail events, SQS events, etc that are also included.

Also, it specifically says "incremental capacity allocated", not necessarily used. Keep in mind that every EC2 instance launched also means new EBS storage is allocated. The article also estimates that 50 million EC2 instances were used for Prime Day. If you assume that half of these were newly created to support the surge of Prime Day, 25 million instances using up 160 PB of storage is only 6 gigabytes per instance, which definitely seems in the realm of possibility.

rqtwteye2y ago

It seems to me that a lot of modern architectures store the same data in multiple places. The systems I see proposed in my company probably need often 10 times more space than the actual data we have because they copy and cache a lot of stuff.

figassis2y ago

Microservices requires denormalizing data across tables and dbs. There’s a cost to how many microservices you build.

CamperBob22y ago

Hot take: Amazon's search UX is so terrible that it not only wastes near-endless amounts of customer time and patience, but their own bandwidth as well.

greatpostman2y ago

They’ve a/b tested it to death

2 more replies

thayne2y ago

A lot of that was certainly just for the root volumes of all those ec2 instances (how much exactly is hard to know without more details). Which of course would have duplicate copies of the various base images for the VMs.

Although, that does bring up the question of why AWS doesn't have a way to share a single read-only volume across multiple ec2 instances in the same availability zone. In many workloads there isn't any need to write to disk.

schlarpc2y ago

There kind of is, but it's not really made for that use case so there's a bunch of caveats (it's read/write, has a limited max number of attachments, io1/2 required, can't be the boot volume): https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volu...

kamikaz1k2y ago

Sometimes it’s just a bad decision that happens to “scaLe”. Like the print video thing.[1]

1. https://youtu.be/J7ITgYBn_3k

twoodfin2y ago

The EBS storage could easily be highly redundant (for good reason) local cache copies of store data.

pipingdog2y ago

Logging, metrics, distributed action trace.

rurp2y ago

> $102 million in infrastructure spend for an event that brought in over $12.7 billion in sales isn’t the worst return on investment that companies could make — by a landslide!

Well it's not amazing if your margin's are tiny, as they are in many industries (such as retail). Plus this was almost certainly architected by some of the foremost AWS experts in the world. It's verrrry easy to spend vastly more than was strictly necessary in AWS.

I don't mean to be too negative though, it was a really interesting article. Pretty wild to think about spending $100m on infrastructure over two days and still making a bunch of profit.

madrox2y ago

Important to remember that, before you could burst your infrastructure in the cloud, sites simply went offline in events like this. You took actively lost revenue in those cases.

ndriscoll2y ago

Or you could just design your architecture to not perform trillions of database requests for hundreds of millions of sales.

The listing data is almost static and should almost fit in RAM (the hot set probably does. Apparently Amazon has ~350M listings. A 24TB RAM server could give ~68kB/listing, and probably only a small fraction is hot). Since you'll need multiple servers anyway, you could shard on products and definitely fit things in RAM. 375 million sales even if condensed into 1 hour would only be 104k/second. A single db server should be able to handle the cart/checkout. Assuming ~10M page views/second, a couple racks of servers should be able to handle it.

The ad/tracking infrastructure surely can't account for the 1000x disparity in resource usage.

3 more replies

Uvix2y ago

Depending on the margins that could be preferable.

boulos2y ago

Yes, at 1% margin on those sales, that's more like $125M in revenue. It's important to remember that things like Prime Day are basically marketing that results in revenue outside the event.

dylan6042y ago

>It's important to remember that things like Prime Day are basically marketing

Be it Prime Day or Black Friday/Cyber Monday sales, I've seen the prices before the sale starts, and then once the sales start, it is the same price but with a slashed out higher MSRP type price. It's not any more of a sale during the sale than it was any of the other days.

Retric2y ago

Yea, actual profit was likely 100 - 400 million or so. As such spending 102 million on a single line item would be a serious question for most companies.

Of course Amazon is paying itself that premium so they have little incentive to care.

mrbonner2y ago

It’s not a surprise for me to hear that Amazon is still a heavy user of RDBMS all these years even after the so-called Rolling Stone project to get rid of Oracle DB in 2015. If Amazon can use RDBMS for their scale, I’m just furious when folks jumping up and down screaming in top of their lungs “Why do we use Postgres and not (insert some random NoSQL engine here)?” My response so far is calmly ask another question “Why not?” And let them try to find a justification to suite our scale requirements.

endisneigh2y ago

It’s fascinating that this is your conclusion from the article. Mine would be that if you can make it work and believe these estimates then dynamodb is clearly more cost effective. And given that every project inevitably settles in access patterns and thus is a perfect fit for something like dynamodb, why bother with rdbms as the hot path? Just use dynamo and stream to a columnar database for analytics once your product is “finished”.

bognition2y ago

It all depends on your workload, access patterns, and data model.

You can absolutely spend an arm and a leg making a system work using a RDBMS that would be simpler and cheaper using a NoSQL store. The opposite is also true.

When picking a database you should always consider the trade offs of the different technologies and weigh those against your goals and budgets.

Sometimes is okay to spend more for a system that is just simpler to manage and use. Sometimes it’s not.

orochimaaru2y ago

Your application use cases should dictate the database choice - eg consistency needed, access patterns, data normalization, reliability, etc.

benjaminwootton2y ago

The real cost would come in the months after whilst trying to decipher the bill adequately to track down everything you used and get it turned off. (Half a joke.)

I imagine there would be a ton of Lambda and the like in there too.

jayzalowitz2y ago

Corey is probably right, but id chunk an extra 10-20% of overprovisioning/undercounting on actual bill here and considering they OWN the fleet, they probably went out of there way to have disaster recovery ready to go in a bunch more contexts.

ckdarby2y ago

Even if AWS treats Amazon like any customer the article is off by a factor of 30-60%.

RIs for their RDS instances. Saving Plan for their EC2s.

1 or 3 year commit, no upfront vs all upfront, etc.

A customer at the size of Amazon using AWS would have private pricing arrangement and an EDP.

simpsond2y ago

You wouldn’t commit for 3 years for increased resources of a single day.

leetrout2y ago

You could and then sell the extra on the spot market for the other ~1000 days.

ckdarby2y ago

True, but all of the usage is not net new and they'll have a base commit.

jayzalowitz2y ago

Honestly, their EDP is probably effectively cost, set in stone to make sure that if the government breaks them up or something like that both systems are good.

jeffbee2y ago

The amount of mail alone is bonkers. If we assume that half of this traffic went to the big operators, Google and Microsoft, each of them would have observed a noticeable traffic bump, 10s of 1000s of requests per second on average all day. It is fun to think about how these systems are interconnected and how they affect each other.

infinitedata2y ago

Funny how folks here and from the article are fixated in comparing the $102M vs the $12.7b. They somehow forget there are product, advertising, warehouses, transportation, shipping, labor, operation and other labor cost involved. You didn’t spend $102 to earn $12,700…

RcouF1uZ4gsC2y ago

Sometime back IIRC, some hackers were upset about something Amazon did and tried to DDOS them.

When they realized their entire attack was just a fraction of what Amazon handled during the Holiday shopping season, they realized the futility and called it off.

mokarma2y ago

Naive question: What are they using EBS for? It seems unnecessary given all the Databases. Is that just local caching for EC2's?

cj2y ago

EBS is just a networked hard drive, so they could really be using it for anything storage related.

Is Amazon’s general architecture for their retail site publicly described anywhere?

OJFord2y ago

Well, without precluding other use, not even specifically caching but just disk for EC2 instances.

nonameiguess2y ago

At minimum, root volumes for the VMs. Theoretically, you could load immutable machine images from the network and run entirely off of in-memory filesystems if you persist nothing past instance shutdown (similar to how extremely cautious people might run Tails booted off USB on a laptop with no hard drive), but that won't actually save cost since memory is more expensive than disk anyway.

thayne2y ago

I don't think you can even technically do that in AWS. I don't think there is any way to detach the root volume from a running instance, or use an immutable network image to boot from. However, for many server workloads, operating entirely would be reasonable. Often you just need the operating system kernel and your server software, and maybe a monitoring agent. And all of that will be loaded in memory anyway.

yowlingcat2y ago

Well, not to answer your question with a question, but what would you imagine backs all of those database services? Or, said another way, I'm not sure Corey Quinn is mapping the cost dependency graph correctly by giving this breakdown as mutually exclusive (from the standpoint of AWS internally).

ripper11382y ago

It’s disk storage for EC2

gumby2y ago

> There’s the internal chargeback costs that AWS charges Amazon for services that would be subject to

Do they do this? I have asked some friends who are developers at AWS and both told me that they don't worry or even know what their usage costs. But that's just anecdote; perhaps their boss knows.

donavanm2y ago

I cant comment on individual teams or the business and accounting practices of Amazon.

I would ABSOLUTELY say that, at a minimum, every director or principal engineer needs to be familiar with costs and _should_ understand their P&L. Senior engineers and line managers probably/should have a passing familiarity or consideration. Individual random SDEs may not as its not their primary business function or deliverable and someone else is ultimately responsible.

Disclosure: Principal at AWS, opinions are my own.

Seanambers2y ago

Isn't the real clue here that the prices in the article are cost + margin. AMZN gets a steal.

ovao2y ago

And notably, AWS can, and likely does, allocate whatever unused or unpartitioned infra to themselves (or, more pedantically, to Amazon). A perpetual ‘savings plan’.

fragmede2y ago

Did any AWS customers experience unavailability during prime day, eg capacity issues launching instances, due to prime day taking precedence over other customers? If there are, they're under NDA so we'd never know.

other_herbert2y ago

You’ve talked me into running some load tests around and before these times… around thanksgiving I’ll give it a shot too… I wonder though if it’s just a redirection of traffic… if regular business sites are less busy because people are shopping it would just slightly shift the load from one “side” to the other

Hmmmm….

bashtoni2y ago

My expectation would be that Prime Day just causes AWS to get a little further ahead than normal with provisioning new infrastructure.

With AWS still growing they are constantly having to add hardware. Ahead of Prime Day, I presume they just bring forward new resources that their model otherwise says aren't needed for a few months.

Unavailability for other customers indicates either AWS growth has plateaued, they have hit the limit of throughput of how much hardware they can provision, or they just did their sums wrong.

vineyardmike2y ago

Amazon surely allocates their resources in advance of prime day, so they can preemptively change prices to account for demand or deny requests.

That said, why would capacity issues be behind NDA? Anyone can grab their API and attempt to allocate a VM (or 100k)

LazyMans2y ago

You can query the spot pricing api and see what’s going on with that. I have a feeling Amazon purposely tries not to hang their customers out to dry by consuming large amounts of spot instances, or on-demand tanking spot availability.

fragmede2y ago

Just a chilling effects from general paranoia over breaking NDA. What is and isn't actually covered by the NDA isn't something I had the time to look up for my comment for.

You can't spin up 100k instances on a virgin account, but it's an interesting idea!

sokoloff2y ago

I’m sure many customers have some form of mNDAs with AWS. I’d have to read ours to be sure, but I don’t think ours would preclude us talking about the problems we experienced on Prime Days, if there were any. (We saw none.)

spencerchubb2y ago

I am not saying that this is true AT ALL, but it would be kind of ironic if AWS slowed down competing ecommerce stores to try and get an advantage.

j / k navigate · click thread line to collapse

62 comments

version_five2y ago

  Amazon Prime Day event resulted in an incremental 163 petabytes of EBS storage capacity allocated – generating a peak of 15.35 trillion requests and 764 petabytes of data transfer per day.

luhn2y ago

Yeah, those numbers struck me as well. At 375 million items sold, that's about 0.5GB storage and 2GB transfer per item.

steveBK1232y ago

10+ years ago I worked on a trading system that was generating something like 1TB/day of messaging.

These are the kinds of decisions that get made when one team is responsible for message generation and the other is responsible for the storage, lol.

We then had to work backwards with them to unwind a lot of the INFO level chatty messaging between what you'd now call "microservices" and reduce the volume by 90+%.

fbdab1032y ago

tomwheeler2y ago

Yes, but that's per item sold.

thenewarrakis2y ago

rqtwteye2y ago

figassis2y ago

Microservices requires denormalizing data across tables and dbs. There’s a cost to how many microservices you build.

CamperBob22y ago

Hot take: Amazon's search UX is so terrible that it not only wastes near-endless amounts of customer time and patience, but their own bandwidth as well.

greatpostman2y ago

They’ve a/b tested it to death

2 more replies

thayne2y ago

schlarpc2y ago

kamikaz1k2y ago

Sometimes it’s just a bad decision that happens to “scaLe”. Like the print video thing.[1]

1. https://youtu.be/J7ITgYBn_3k

twoodfin2y ago

The EBS storage could easily be highly redundant (for good reason) local cache copies of store data.

pipingdog2y ago

Logging, metrics, distributed action trace.

rurp2y ago

> $102 million in infrastructure spend for an event that brought in over $12.7 billion in sales isn’t the worst return on investment that companies could make — by a landslide!

I don't mean to be too negative though, it was a really interesting article. Pretty wild to think about spending $100m on infrastructure over two days and still making a bunch of profit.

madrox2y ago

Important to remember that, before you could burst your infrastructure in the cloud, sites simply went offline in events like this. You took actively lost revenue in those cases.

ndriscoll2y ago

Or you could just design your architecture to not perform trillions of database requests for hundreds of millions of sales.

The ad/tracking infrastructure surely can't account for the 1000x disparity in resource usage.

3 more replies

Uvix2y ago

Depending on the margins that could be preferable.

boulos2y ago

Yes, at 1% margin on those sales, that's more like $125M in revenue. It's important to remember that things like Prime Day are basically marketing that results in revenue outside the event.

dylan6042y ago

>It's important to remember that things like Prime Day are basically marketing

Retric2y ago

Yea, actual profit was likely 100 - 400 million or so. As such spending 102 million on a single line item would be a serious question for most companies.

Of course Amazon is paying itself that premium so they have little incentive to care.

mrbonner2y ago

endisneigh2y ago

bognition2y ago

It all depends on your workload, access patterns, and data model.

You can absolutely spend an arm and a leg making a system work using a RDBMS that would be simpler and cheaper using a NoSQL store. The opposite is also true.

When picking a database you should always consider the trade offs of the different technologies and weigh those against your goals and budgets.

Sometimes is okay to spend more for a system that is just simpler to manage and use. Sometimes it’s not.

orochimaaru2y ago

Your application use cases should dictate the database choice - eg consistency needed, access patterns, data normalization, reliability, etc.

benjaminwootton2y ago

The real cost would come in the months after whilst trying to decipher the bill adequately to track down everything you used and get it turned off. (Half a joke.)

I imagine there would be a ton of Lambda and the like in there too.

jayzalowitz2y ago

ckdarby2y ago

Even if AWS treats Amazon like any customer the article is off by a factor of 30-60%.

RIs for their RDS instances. Saving Plan for their EC2s.

1 or 3 year commit, no upfront vs all upfront, etc.

A customer at the size of Amazon using AWS would have private pricing arrangement and an EDP.

simpsond2y ago

You wouldn’t commit for 3 years for increased resources of a single day.

leetrout2y ago

You could and then sell the extra on the spot market for the other ~1000 days.

ckdarby2y ago

True, but all of the usage is not net new and they'll have a base commit.

jayzalowitz2y ago

Honestly, their EDP is probably effectively cost, set in stone to make sure that if the government breaks them up or something like that both systems are good.

jeffbee2y ago

infinitedata2y ago

RcouF1uZ4gsC2y ago

Sometime back IIRC, some hackers were upset about something Amazon did and tried to DDOS them.

When they realized their entire attack was just a fraction of what Amazon handled during the Holiday shopping season, they realized the futility and called it off.

mokarma2y ago

Naive question: What are they using EBS for? It seems unnecessary given all the Databases. Is that just local caching for EC2's?

cj2y ago

EBS is just a networked hard drive, so they could really be using it for anything storage related.

Is Amazon’s general architecture for their retail site publicly described anywhere?

OJFord2y ago

Well, without precluding other use, not even specifically caching but just disk for EC2 instances.

nonameiguess2y ago

thayne2y ago

yowlingcat2y ago

ripper11382y ago

It’s disk storage for EC2

gumby2y ago

> There’s the internal chargeback costs that AWS charges Amazon for services that would be subject to

Do they do this? I have asked some friends who are developers at AWS and both told me that they don't worry or even know what their usage costs. But that's just anecdote; perhaps their boss knows.

donavanm2y ago

I cant comment on individual teams or the business and accounting practices of Amazon.

Disclosure: Principal at AWS, opinions are my own.

Seanambers2y ago

Isn't the real clue here that the prices in the article are cost + margin. AMZN gets a steal.

ovao2y ago

And notably, AWS can, and likely does, allocate whatever unused or unpartitioned infra to themselves (or, more pedantically, to Amazon). A perpetual ‘savings plan’.

fragmede2y ago

other_herbert2y ago

Hmmmm….

bashtoni2y ago

My expectation would be that Prime Day just causes AWS to get a little further ahead than normal with provisioning new infrastructure.

With AWS still growing they are constantly having to add hardware. Ahead of Prime Day, I presume they just bring forward new resources that their model otherwise says aren't needed for a few months.

Unavailability for other customers indicates either AWS growth has plateaued, they have hit the limit of throughput of how much hardware they can provision, or they just did their sums wrong.

vineyardmike2y ago

Amazon surely allocates their resources in advance of prime day, so they can preemptively change prices to account for demand or deny requests.

That said, why would capacity issues be behind NDA? Anyone can grab their API and attempt to allocate a VM (or 100k)

LazyMans2y ago

fragmede2y ago

Just a chilling effects from general paranoia over breaking NDA. What is and isn't actually covered by the NDA isn't something I had the time to look up for my comment for.

You can't spin up 100k instances on a virgin account, but it's an interesting idea!

sokoloff2y ago

spencerchubb2y ago

I am not saying that this is true AT ALL, but it would be kind of ironic if AWS slowed down competing ecommerce stores to try and get an advantage.

j / k navigate · click thread line to collapse