Slashing data transfer costs in AWS

(bitsand.cloud)

355 points · danielklnstein · 2y ago · 262 comments

107 comments · 30 top-level

andrewstuart2y ago· 28 in thread

An alternative to sophisticated cloud cost minimization systems is... don’t use the cloud. Host it yourself. Or use Cloudflare, which has 0 cents per gigabyte egress fees. Or just rent cloud servers from one of the many much cheaper VPS hosting services and don’t use all the expensive and complex cloud services, all designed to lock you in and drain cash from you at 9 or 12 or 17 cents per gigabyte.

Seriously, if you’re at the point that you’re doing sophisticated analysis of cloud costs, consider dropping the cloud.

overstay89302y ago

If you're at the point you're doing sophisticated cloud cost analysis you are doing the cloud right, because that is completely impossible anywhere else.

I swear the people who say go on premise have no idea how much the salary costs of someone who will not treat their datacenter like a home lab is. Even Apple iCloud is in AWS and GCP because of how economical it is. If you think you have to go back on prem, either you suck at the cloud or you just don't give a shit about reliability (start pricing up DDoS protection costs at anything higher than 10G and tell me the cloud is more expensive).

We spend 100k+ on AWS bandwidth and it's still cheaper than our DIA circuits because we don't have to pay network engineers to manage 3 different AZs.

dzikimarian2y ago

Apparently we're doing the impossible for over 12 years now. Who knew?

Some people act like it's some kind of black magic. It's not. We've some customers in our DC and some on AWS for various reasons. AWS isn't less problematic. AWS is about 10x more expensive. Both on prem and cloud require people familiar with them and cloud-engineers are in no way cheaper.

The only meaningful problem is that on-prem requires some up-front cost and time. That can be mitigated by leasing and other means, but it can indeed be an issue for small businesses.

4 more replies

holoduke2y ago

My small business once spent 50k per month on AWS. We brought that back to 800 dollars for a similar setup at Hetzner. I find this a significant number.

3 more replies

snug2y ago

AWS and GCP are giving companies like Apple huge discounts so someone could say something like, "Even Apple iCloud is in AWS and GCP because of how economical it is"

There is too much nuance to say one is better than the other. In some cases using an IaaS is more economical, in other cases it's not.

For Apple, the same is also true[0] to say "Even Apple is running their own datacenters because of how economical it is"

0 - https://dgtlinfra.com/apple-data-center-locations/#:~:text=a....

1 more reply

qvrjuec2y ago

>someone who will not treat their datacenter like a home lab

What does this mean? They steal company resources for themselves, or just configure things incompetently?

2 more replies

tiffanyh2y ago

> on premise have no idea how much the salary costs of someone

I haven’t yet met anyone at a company that heavily uses cloud that doesn’t still have the same number of salaried infra people as on-prem.

echelon2y ago

You act like these problems are especially hard.

Active-active, five nines, fault tolerance. Hard stuff. But managing on-prem is no harder.

This is what we're paid for.

4 more replies

maeln2y ago

> Seriously, if you’re at the point that you’re doing sophisticated analysis of cloud costs, consider dropping the cloud.

Which would mean that you lose part of the reason to use the cloud in the first place... A lot of orgs move to cloud-based hosting because it enables them to go way further in FinOps / cost control (amongst many other things).

This can make a lot of sense depending on your infra: if you have some fluctuation in your needs (storage, compute, etc.), a cloud-based solution can be a great fit.

At the end of the day, it is just a tool. I worked in places where I SSH'd into the prod bare-metal server to update our software, manage the firewall, check the storage, ... and all that manually. And I worked in places where we were using a cloud provider for most of our hosting needs. I also handled the transition from one to the other. All I can say is: it's a tool. "The cloud" is no better or worse than a bare-metal server or a VPS. It really depends on your use case. You should just do your due diligence and evaluate the reasons why one would fit you more than the other, and reevaluate from time to time depending on the changes in your environment.

This whole "cloud bad" is just childish.

ownagefool2y ago

> A lot of orgs move to cloud-based hosting because it enables them to go way further in FinOps / cost control

I think a lot of orgs move to cloud simply because it's popular and Gartner told them so.

But taking a step away from that, it's really about self-service. When the alternative is logging a ticket for someone to manually misconfigure a VM and then fail to send you the login credentials, then your delivery is slow.

When you're chasing revenue, going slow means you're leaving money on the table. When you're a big bureaucratic org, it means your middle managers can't claim to have delivered a whole bunch of shit. Nobody likes being held up, but that's what infrastructure teams historically do.

4 more replies

andrewstuart2y ago

>> This whole "cloud bad" is just childish.

Not childish…. it’s a growing line of thought in the IT community that has bought the cloud sell unquestioningly for 20 years.

2 more replies

calvinmorrison2y ago

The cloud is better and worse than bare metal. It depends on the use case.

AWS is Kafkaesque though

1 more reply

whstl2y ago

There is a massive difference between "[if X happens] consider dropping the cloud" and "cloud bad".

Jean-Papoulos2y ago

>Seriously, if you’re at the point that you’re doing sophisticated analysis of cloud costs, consider dropping the cloud.

The blog post's solution is relatively simple to put in place; if you're already locked in to AWS, dropping it will cost quite a lot, and this might be a great middle ground in some cases.

antonvs2y ago

> Host it yourself.

If you're actually using the features of cloud - i.e. managed services - then this involves building out an IT department with skills that many companies just don't have. And the cost of that will easily negate or exceed any savings, in many cases.

That's a big reason that the choice of cloud became such a no brainer for companies.

sgarland2y ago

Next you’ll tell me that full-stack is a lie, and devs don’t actually know how to run a DB.

2 more replies

crabbone2y ago

I don't think "host yourself" in this instance would've helped. I think AWS in this instance is operating at a loss. Author found a loophole in AWS pricing and that's why it's so cheap. Doing it on their own would've been more expensive.

Now as to why AWS pricing is the way it is... we can only guess, but it's likely to promote one service over another.

amluto2y ago

AWS is surely operating at a loss for this particular case (zero S3 charge).

But there is no way that cross-AZ traffic costs AWS anywhere near what they charge for it.

belter2y ago

Those VPS hosting services are a solution for a startup but not for running your internet banking, airline company or public SaaS product. Plus their infinite egress bandwidth and data transfer claims are only true as long as you don't... use infinite egress bandwidth and data transfer.

api2y ago

There are also many bare metal and VPS providers that charge radically less for bandwidth or even charge by size of pipe rather than transfer.

Cloud bandwidth pricing is… well the best analogies are very inappropriate for this site.

cdchn2y ago

Which VPS services don't charge you at all for bandwidth?

dijit2y ago

you'd be harder pressed finding ones that do (before a reasonable limit).

tilaa.com

vultr.com

hetzner.com

linode.com

2 more replies

champtar2y ago

OVH (except for APAC data centers), I think Scaleway also has bandwidth included.

frankjr2y ago

https://www.netcup.eu/vserver/#root-server-details (with fair use policy)

mciancia2y ago

https://www.scaleway.com/en/

1 more reply

Turing_Machine2y ago

The tipping point for self-hosting would be the point at which paying for full-time, 24/7 on-call sysadmins (salary + benefits) is less than your cloud hosting bill.

That's just not gonna happen for a lot of services.

lijok2y ago

This makes no sense.

So what do I do if I'm at the point where I'm doing sophisticated analysis of on-prem costs? Do I move to the cloud?

barkingcat2y ago

Also, if you really need to transfer hundreds of TBs of data from South Africa to the US, put it on a pallet of disks and send it via bulk shipping.

then load the data into the datacentre and then just pay for the last sync / delta.

esafak2y ago

https://aws.amazon.com/snowcone/

develatio2y ago· 10 in thread

I'll share my trick :)

Lightsail instances can be used to "proxy" data from other AWS resources (e.g. EC2 instances or S3 buckets). Each Lightsail instance has a certain amount of data transfer included in its price ($3.5 instance has 1TB, $5 instance has 2TB, $10 instance has 3TB, $20 instance has 4TB, $40 instance has 5TB). The best value (dollar per transferred data) is the $10 instance, which gives you 3TB of traffic.

Using the data provided by the post:

3TB worth of traffic from an EC2 would cost $276.48 (us-east-1). 3TB worth of traffic from a S3 bucket would cost $69.

Note: one downside of using Lightsail instances is that both ingress and egress traffic counts as "traffic".

jonatron2y ago

https://aws.amazon.com/service-terms/

> 51.3. You may not use Amazon Lightsail in a manner intended to avoid incurring data fees from other Services (e.g., proxying network traffic from Services to the public internet or other destinations or excessive data processing through load balancing or content delivery network (CDN) Services as described in the technical documentation), and if you do, we may throttle or suspend your data services or suspend your account.

develatio2y ago

I had a suspicion that this was against AWS's terms, but I never bothered to look if that was actually the case. Thank you for the heads up!

1 more reply

Hamuko2y ago

At least AWS is fully aware how premium their normal data transfer is and that one might want to optimise those costs.

1 more reply

mmh00002y ago

Yeah, but "service terms" are just recommendations that should often be ignored.

sangnoir2y ago

> You may not use Amazon Lightsail in a manner intended to avoid incurring data fees from other Services

This requires proving the user's intent, which is not obvious except in the most blatant of cases (i.e. using Lightsail as a bent pipe by writing the exact bytes you're reading). If it is a "CSV to Parquet translation layer", how would AWS possibly prove it's anything other than what it claims to be? You'd be paying a few more cents for compute, but that's the price of plausible deniability.

2 more replies

rfoo2y ago

Here's another one:

You can download 1TB of data for free from AWS each month, as CloudFront has a free tier [1] with 1TB monthly egress included. Point it to S3 or whatever HTTP server you want and voila.

[1] It used to be 50GB per month for the first 12 months. It was changed to 1TB free forever shortly after Cloudflare posted https://blog.cloudflare.com/aws-egregious-egress

overstay89302y ago

That "shortly" was 2 years, that Cloudflare post had nothing to do with it, Amazon barely considers them a competitor to begin with.

1 more reply

andruby2y ago

Nice!

Nitpick: $5 for 2TB is better than $10 for 3TB.
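
The nitpick is easy to check with per-terabyte arithmetic. The sketch below uses only the plan prices and included transfer quoted in the parent comment; the names `PLANS`, `cost_per_tb` and `best_plan` are made up for illustration, and current Lightsail pricing may differ.

```python
# Lightsail plans as quoted in the parent comment:
# monthly price (USD) -> included data transfer (TB).
PLANS = {3.5: 1, 5.0: 2, 10.0: 3, 20.0: 4, 40.0: 5}


def cost_per_tb(plans):
    """Return {monthly_price: dollars per included TB}."""
    return {price: price / tb for price, tb in plans.items()}


def best_plan(plans):
    """Return the monthly price of the plan with the lowest dollars per TB."""
    per_tb = cost_per_tb(plans)
    return min(per_tb, key=per_tb.get)


if __name__ == "__main__":
    for price, rate in sorted(cost_per_tb(PLANS).items()):
        print(f"${price:g}/month plan: ${rate:.2f} per TB")
    print(f"best value: ${best_plan(PLANS):g}/month plan")
```

With these figures the $5 plan works out to $2.50 per TB against roughly $3.33 per TB for the $10 plan.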

develatio2y ago

Ooohhh!! It is, indeed!

intelVISA2y ago

Nice trick, but you are playing with fire given AWS's terms.

jakozaur2y ago· 7 in thread

S3 is a nice trick. More tricks:

1. Ask for discounts if you are a big AWS customer (e.g., spend $1mln+/year). At some point, they were huge for inter-AZ transfers.

2. Put things in one AZ. Running DB in "b" zone and your only server in "a" is even worse than just standardizing on one zone.

3. When using multiple AZs, do load-aware AZ balancing.

throwaway1672y ago

> Running DB in "b" zone and your only server in "a"

There must be use cases for this, but I lack imagination. Cost? But not cost?

jakozaur2y ago

Defaults. Either as code or using ClickOps.

Many companies run servers without considering AZs. Then you can get the "best" of both worlds:

1. Your service is down if either AZ has hiccups.

2. You pay network charges and latency cost.

sokoloff2y ago

I can't see a reason to do this intentionally within a single account, but use cases with multiple accounts should be aware that the AZ named us-east-1a in Account 1 is not necessarily the same physical AZ as the one named us-east-1a in Account 2. AZ names are mapped per account; only the AZ ID identifies the same zone across accounts.

https://docs.aws.amazon.com/ram/latest/userguide/working-wit...
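
To make the name-versus-ID point concrete, here is a minimal sketch that lines up two accounts' AZ names by their AZ IDs. The two mappings are hypothetical sample data; in practice each account's mapping would come from its own EC2 DescribeAvailabilityZones response, which reports both a zone name and a zone ID. The function name `matching_names` is also made up for illustration.

```python
# Hypothetical name -> ID mappings for two accounts (illustrative values only).
ACCOUNT_1 = {"us-east-1a": "use1-az1", "us-east-1b": "use1-az2"}
ACCOUNT_2 = {"us-east-1a": "use1-az2", "us-east-1b": "use1-az1"}


def matching_names(names_to_ids_a, names_to_ids_b):
    """For each AZ name in account A, return the name account B uses for the same AZ ID.

    The value is None when account B has no zone with that ID.
    """
    ids_to_names_b = {zone_id: name for name, zone_id in names_to_ids_b.items()}
    return {
        name: ids_to_names_b.get(zone_id)
        for name, zone_id in names_to_ids_a.items()
    }


if __name__ == "__main__":
    for name_a, name_b in matching_names(ACCOUNT_1, ACCOUNT_2).items():
        print(f"Account 1 {name_a} is Account 2 {name_b}")
```

In this made-up example the two accounts have their "a" and "b" labels swapped, so placing both sides in us-east-1a would still cross AZs.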

QuadmasterXLII2y ago

Cost, unlimited cost- but no cost.

pibefision2y ago

4. Activate S3 Intelligent-Tiering storage class?

danielklnsteinOP2y ago

This is great for saving on S3 storage costs!

But in the context of data transfer costs, this would actually increase the costs, because there's a small surcharge for Intelligent Tiering - and the only relevant storage class for sidestepping data transfer costs is standard storage (because it's the only one with free download), so Intelligent Tiering won't provide value.

endgame2y ago

You've got to be careful of the automation charge with Intelligent Tiering.

https://discourse.nixos.org/t/the-nixos-foundations-call-to-...

ishitatsuyuki2y ago· 5 in thread

GCP patched a similar loophole [1] in 2023 presumably because some of their customers were abusing it. I'd expect AWS to do the same if this becomes widespread enough.

[1]: https://cloud.google.com/storage/pricing-announce#network

rfoo2y ago

Unlikely. The "loophole" GCP patched was that you can use GCS to transfer data between regions on the same continent for free. This is already non-free on AWS. What OP mentioned is that transferring data between availability zones *in the same region* also costs $0.02 per GB and can be worked around.

cedws2y ago

I know of a way to get data out of GCP for free, although I haven't tried it in practice. Wonder if I could find a buyer for this info ;)

BonoboIO2y ago

I got one for Azure, where it would be nearly free to egress data from Azure to any other cloud provider or the internet.

It works, but I have no use case for it. 100TB egress to the internet costs about $7000... I think I could do it for $20-$50.

greyface-2y ago

A guess: tunnel through 169.254.169.254 DNS server?

1 more reply

Cthulhu_2y ago

This doesn't feel like a loophole though, it feels like they have optimized S3 and intend your EC2 instances to use S3 as storage. But maybe not as transfer window, that is, they expect you to put and leave your data on there.

andersa2y ago· 5 in thread

I don't understand how AWS can keep ripping people off with these absurd data transfer fees, when there is Cloudflare R2 just right over there offering a 100 times better deal.

karlkatzke2y ago

Data has "gravity" -- as in, it holds you down to where your data is, and you have to spend money to move it just like you have to spend money to escape gravity.

perryizgr82y ago

When all my VMs and containers are hosted in AWS, and S3 has rock solid support no matter what language, framework, or setup I use, it becomes really tough to ask the team to use another vendor for object storage. If something goes wrong with R2 (data loss, slow transfer, etc.) I will get blamed (or at least asked for help). If S3 loses data or performs slowly in some case, people will just figure we're somehow using it wrong. And they will figure out how to make it better. Nobody gets blamed. And to be honest, data transfer fees are negligible if your business is creating any sort of value. You don't need to optimise them.

tnolet2y ago

we just built a new feature for our pretty bandwidth heavy SaaS on R2. Works pretty damn good with indeed massive savings. We just use the AWS-SDK (Node.js) and use the R2 endpoint.
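
The comment describes doing this with the AWS SDK for Node.js; the sketch below shows the same idea in Python with boto3 instead, which is a substitution, not the commenter's code. It assumes R2's S3-compatible endpoint has the form `https://<account_id>.r2.cloudflarestorage.com` and that `"auto"` is accepted as the region. The helper names and the credential parameters are placeholders.

```python
def r2_endpoint(account_id: str) -> str:
    """Build the S3-compatible endpoint URL for a Cloudflare account (assumed format)."""
    return f"https://{account_id}.r2.cloudflarestorage.com"


def make_r2_client(account_id: str, access_key_id: str, secret_access_key: str):
    """Create a boto3 S3 client that talks to R2 instead of AWS S3."""
    # boto3 is a third-party package; it is imported lazily so that
    # r2_endpoint() stays usable without it installed.
    import boto3

    return boto3.client(
        "s3",
        endpoint_url=r2_endpoint(account_id),
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
        region_name="auto",
    )
```

Everything after client creation is the ordinary S3 client API (for example `upload_file` or `get_object`), which is why switching the endpoint is usually the only code change.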

akira25012y ago

I trust cloudflare far less than AWS. Once my data is in AWS all applications in the same region as the data can use the data without paying anything in transfer costs.

Also, the prices he quotes are label prices, if you are a customer and you pre purchase your bandwidth under an agreement, it gets _significantly_ less expensive.

fabian2k2y ago

R2 is still pretty new. I don't know how well it works in practice in terms of performance and availability. And of course durability, which is difficult if not impossible to judge. S3 has a much longer history and track record, so it has the advantage here. And if all your stuff is inside AWS already there are advantages to keeping the data closer. Depending on how the data is used, egress might also not always be such a major cost.

But yes, the moment you actually produce significant amounts of egress traffic it gets absurdly expensive. And I would expect competitors like R2 to gain ground if they can provide reasonably competitive reliability and performance.

quickthrower22y ago· 5 in thread

This is a loophole. Hitting some loss leader at AWS, but if everyone only buys the $1 hotdog and nothing else then the $1 hotdog gets removed.

danielklnsteinOP2y ago

I'm not sure how this could be removed - the fundamentals behind it are basic building blocks of S3.

Maybe raising the cost of transient storage? e.g. If you have to pay for a minimum of a day's storage - but even if that was the case this would still be cost-effective, and at any rate it seems very unnatural for AWS to charge on such granularity.

+ I would guess that S3 is orders of magnitude more profitable for AWS than cross-AZ charges, so I'm not sure they'd consider it a loss-leader.

kevincox2y ago

It would be fairly easy to change the pricing policy. GCP did something similar for cross-region https://cloud.google.com/storage/pricing-announce#network. This is pretty severe because it seems to affect all reads. However I can imagine an alternate implementation where the source AZ is tracked when data is written and egress fees are charged when the data is read (as if the data was always stored in the source AZ). This could even be done more complexly such as only charging the first time data is read in another AZ. Once you read once it is free as-if it is now cached in that new AZ forever. Another option would just be raising the minimum storage duration so that it basically costs all or most of what the data transfer would.

It would definitely piss a lot of people off as it is adding to their bill, but it could likely be done in a way that makes exploiting this for just data transfer not worth it without adding huge costs to most "real" use cases.
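
As a toy model of the "charge only the first read from another AZ" idea above: the class and rule below are entirely hypothetical and do not describe how AWS bills anything. The $0.02 per GB rate is the cross-AZ figure quoted elsewhere in this thread.

```python
CROSS_AZ_RATE_PER_GB = 0.02  # cross-AZ figure quoted elsewhere in this thread


class ObjectRecord:
    """Tracks where an object was written and which AZs have already paid to read it."""

    def __init__(self, size_gb, source_az):
        self.size_gb = size_gb
        self.source_az = source_az
        # Reads from the AZ that wrote the object are always free.
        self.paid_azs = {source_az}

    def read_charge(self, reader_az):
        """Return the charge for one read from reader_az under the hypothetical rule."""
        if reader_az in self.paid_azs:
            return 0.0
        # First read from a new AZ: charge once, then treat it as cached there.
        self.paid_azs.add(reader_az)
        return self.size_gb * CROSS_AZ_RATE_PER_GB


if __name__ == "__main__":
    obj = ObjectRecord(size_gb=100, source_az="us-east-1a")
    for az in ["us-east-1a", "us-east-1b", "us-east-1b"]:
        print(az, obj.read_charge(az))
```

Under such a rule, a write in one AZ followed by a single read in another would cost the same as a direct cross-AZ transfer, while repeated reads in the same AZ would stay free.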

1 more reply

api2y ago

It’s not a loss leader. Cloud bandwidth pricing is almost pure profit.

martinald2y ago

It's absolutely amazing that so many devs don't realise this. They seem to think that bandwidth should cost a few cents a month, when in reality it is virtually free. Perhaps the 7c/GB charge was reasonable when AWS came out 15 years ago, but networking has got orders of magnitude cheaper and faster in the intervening time period.

What's more odd now that 1gigabit+ home connections are available, it should be obvious to anyone doing the math that it can't cost that much, otherwise a 200GB CoD install would be costing the ISP $20.

4 more replies

quickthrower22y ago

Ok “loss” is a relative word here… a loss compared to what they could have got from you.

Somehow AWS has to rip you off, so if there is a non-rip-off gateway to the rip-off, and you can use it to avoid another rip-off, they will close the “loophole”.

lucidguppy2y ago· 2 in thread

This feels like the tech equivalent of tax avoidance.

If too many people do this - AWS will "just close the loophole".

There's not one AWS - there are probably dozens if not hundreds of AWS - each with their own KPIs. One wants to reduce your spend - but not tell you how to really reduce your spend.

If you make something complex enough (AWS) - it will be impossible for customers to optimize in any one factor - as everything is complected together.

karlkatzke2y ago

This isn't a loophole. This is by design. AWS wants you to use specific services in specific ways, so they make it really cheap to do so. Using an endpoint for S3 is one of the ways they want you to use the S3 service.

Another example is using CloudFront. AWS wants you to use CloudFront, so they make CloudFront cheaper than other types of data egress.

lucidguppy2y ago

If they wanted you to behave in specific ways logically - wouldn't their documentation be less ambiguous?

https://www.lastweekinaws.com/blog/aws-cross-az-data-transfe...

jonatron2y ago· 2 in thread

If you're a heavy bandwidth user it's worth looking at Leaseweb, PhoenixNAP, Hetzner, OVH, and others who have ridiculously cheaper bandwidth pricing.

I remember a bizarre situation where the AWS sales guys wouldn't budge on bandwidth pricing even though the company wouldn't be viable at the standard prices.

declan_roberts2y ago

That’s very unusual, I think. Transfer costs seem to be something most people can negotiate.

jonatron2y ago

I hadn't really thought about it much, but googling it looks like there's a discount programme for a committed spend of around $1M/year. For a small company, that's a lot of money, and it was an unusually large amount of bandwidth for the size of company. I suppose it makes sense now I know they're interested in companies spending that sort of money.

1 more reply

mlhpdx2y ago· 2 in thread

I’ve been deploying 3xAZs in 3xRegions for a while now (years). The backing store being regional s3 buckets (keeping data in the local compliance region) and DDB with replication (opaque indexing and global data) and Lambda or Sfn for compute. So far data transfer hasn’t risen to the level of even tertiary concern (at scale). Perhaps because I don’t have video, docker or “AI” in play?

hipadev232y ago

I’m guessing either you don’t have much data or your infra is already so absurd that, yeah, the transfer costs are irrelevant by comparison.

mlhpdx2y ago

Not using VPCs (no need without instances/containers/RDS) means most of the “absurd” costs go away. It’s cheap by any standard.

1 more reply

xbar2y ago· 2 in thread

After my account started getting bills this month for pennies for which there was no obvious accounting, I slashed my AWS costs by 100%.

I'm back to managing my own systems. So much cheaper and less chance of nonlinear bills.

danielklnsteinOP2y ago

In case you ever decide to return to AWS, its Cost Explorer is far from perfect but it can show you where your expenses are coming from, especially if your costs are pennies. In the last re:invent they even released daily granularity when grouping by resources (https://aws.amazon.com/blogs/aws-cloud-financial-management/...).

rospaya2y ago

Probably free tier expiration for some small change. With me it was AWS KMS.

nodeshift2y ago· 2 in thread

Someone in the thread said that if you're 'at the point that you’re doing sophisticated analysis of cloud costs, consider dropping the cloud.'

We've built https://nodeshift.com/ with the idea that cloud is affordable by default without any additional optimization, you focus on your app with no concerns on costs or anything else.

akira25012y ago

Cost analysis has helped me build great infrastructure on AWS. The costs are communicating to you what is and is not efficient for AWS to do on your behalf. By analyzing the costs and working to reduce them, you also incidentally increase efficiency and in some cases, such as this one, workload durability.

nodeshift2y ago

Cost analysis should of course play the foundation of everything you build, regardless if it's SaaS tooling or infrastructure. But surely it's easier to do a cost assessment and optimization exercise on something that is fundamentally more affordable than AWS and doesn't have as high of margin costs? That's why we have built a platform that creates all the value at a low cost.

lijok2y ago· 1 in thread

There are tons of these tricks you can use to cut costs and get resources for free. It's smart, but not reliable. It's the same type of hacking that leads to crypto mining on github actions via OSS repos.

Treat this as an interesting hacking exercise, but do not deploy a solution like this to production (or at least get your account manager's blessing first), lest you risk waking up to a terminated AWS account.

huslage2y ago

I have used this and other techniques for years and never gotten shut down. Passing through S3 is also generally more efficient for distributing data to multiple sources than running some sync process.

arpinum2y ago· 1 in thread

Another trick is to use ECR. You can transfer 5TB out to the internet each month for free. The container image must be public, but you can encrypt the contents. Useful when storing media archives in Glacier.

declan_roberts2y ago

Sneaky idea! I love it!

vlovich1232y ago· 1 in thread

> it’s almost as if S3 doesn’t charge you anything for transient storage? This is very unlike AWS, and I’m not sure how to explain this. I suspected that maybe the S3 free tier was hiding away costs, but - again, shockingly - my S3 storage free tier was totally unaffected by the experiment, none of it was consumed (as opposed to the requests free tier, which was 100% consumed).

It’s also possible their billing system can’t detect transient storage usage. Request billing would work differently from how billed storage is tracked. It depends on how billing is implemented but would be my guess. That may change in the future.

adrianmonk2y ago

Maybe some sampling mechanism comes along and takes a snapshot once per hour.

Suppose you store the data there for 6 minutes. Then there's a 90% probability that the sampler misses it entirely and you pay $0. But there's a 10% probability that the sampler does catch it. Then you pay for a whole hour even though you used a fraction of that.

Over many events, it averages out close to actual usage[1]. In 9 out of 10 cases, you pay 0X actual usage. In 1 out of 10 cases, you pay 10X actual usage. (But you can't complain because you did agree to 1-hour increments.)

---

[1] Assuming no correlation between your timing and the sampler's timing. If you can evade the sampler by guessing when it runs and carefully timing your access, then you can save a few pennies at the risk of a ban.
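
The averaging argument can be written out as a short expected-value calculation. This is a sketch of the speculated hourly sampler only, not a description of how S3 actually meters storage; it assumes the object lives no longer than one sampling interval and that the sampler's phase is uniformly random relative to the upload. The function names are made up for illustration.

```python
from fractions import Fraction


def hit_probability(stored_minutes, interval_minutes=60):
    """Probability that a sampler firing once per interval sees the object."""
    if not 0 <= stored_minutes <= interval_minutes:
        raise ValueError("this sketch only covers objects living at most one interval")
    return Fraction(stored_minutes, interval_minutes)


def expected_billed_minutes(stored_minutes, interval_minutes=60):
    """Expected billed time when a hit bills one full interval and a miss bills nothing."""
    return hit_probability(stored_minutes, interval_minutes) * interval_minutes


if __name__ == "__main__":
    p = hit_probability(6)
    print(f"hit probability: {float(p):.0%}")
    print(f"expected billed minutes: {float(expected_billed_minutes(6)):g}")
```

For 6 minutes of storage the hit probability is 1/10 and each hit bills 60 minutes, so the expected billed time is 6 minutes, which equals actual usage.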

TheNewsIsHere2y ago· 1 in thread

This may be arguably nitpicking, but the following statement from TFA isn’t exactly the case:

> Moreover, uploading to S3 - in any storage class - is also free!

Depending upon how much data you’re transferring in terms of storage class, number of API calls your software makes to do so, and the capacity used, you may incur charges. This is very easy to inadvertently do when uploading large volumes of archival data directly to the S3 Glacier tiers. You absolutely will pay if you end up using millions of API calls to upload tens of millions of objects comprising tens of terabytes or more.

danielklnsteinOP2y ago

Thanks for the feedback! I don't think it's nitpicking, you're right that it's misleadingly phrased - in fact, the only S3 costs I observed weren't storage at all, but rather the API calls.

I updated the phrasing.

dangoodmanUT2y ago· 1 in thread

I've not seen any evidence that multi-AZ is more resilient. There's no history of an entire AZ going down that doesn't affect the entire region, at least that I can find on the internet within 15 minutes of googling.

playingalong2y ago

Do you mean S3 or all services?

If all services, then things like whole or most of a single AZ being borked happens fairly often.

issafram2y ago· 1 in thread

I've been looking for a place to store files for backup. Already keeping a local copy on NAS, but I want another one to be remote. Would you guys recommend S3? Wouldn't be using any other services.

spieden2y ago

I use S3 with the DEEP_ARCHIVE storage class for disaster recovery. Costs go up if you have many thousands of files so careful there. Hopefully will never need to access the objects and it's the cheapest I could find.

esafak2y ago· 1 in thread

How do people economically run multi-region databases for low latency?

declan_roberts2y ago

Data transfer costs are extremely easy to negotiate with Amazon.

boiler_up8002y ago

S3 storage costs are charged per GB month so 1 TB * .023 per GB / 730 hrs per month… should be 3 cents if the data was left in the bucket for an hour.[1]

However, it sounds like it was deleted almost right away. In that case the charge might be 0.03 / 60 if the data was around for a minute. Normally I would expect AWS to round this up to $0.01.

The TimedByteStorage value from the cost and usage report would be the ultimate determinant here.

[1] https://handbook.vantage.sh/aws/services/s3-pricing/
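
The estimate above, worked through as a sketch. It uses the $0.023 per GB-month rate and 730 hours per month from this comment, takes 1 TB as 1024 GB (an assumption), and compares against the $0.02 per GB cross-AZ figure quoted elsewhere in the thread. Request charges are left out, and the function names are made up for illustration.

```python
STORAGE_RATE_PER_GB_MONTH = 0.023  # S3 standard storage rate used in the comment
HOURS_PER_MONTH = 730
CROSS_AZ_RATE_PER_GB = 0.02  # cross-AZ figure quoted elsewhere in this thread
TB_IN_GB = 1024  # assumption: 1 TB counted as 1024 GB


def storage_cost(size_gb, hours):
    """Prorated storage cost in USD for keeping size_gb in the bucket for the given hours."""
    return size_gb * STORAGE_RATE_PER_GB_MONTH * hours / HOURS_PER_MONTH


def cross_az_cost(size_gb):
    """Cost in USD of sending size_gb directly between AZs at the quoted rate."""
    return size_gb * CROSS_AZ_RATE_PER_GB


if __name__ == "__main__":
    print(f"1 TB stored for an hour:   ${storage_cost(TB_IN_GB, 1):.4f}")
    print(f"1 TB stored for a minute:  ${storage_cost(TB_IN_GB, 1 / 60):.6f}")
    print(f"1 TB sent cross-AZ:        ${cross_az_cost(TB_IN_GB):.2f}")
```

Even if the full hour were billed, the storage charge is about three cents against roughly $20 for the direct cross-AZ transfer.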

glenngillen2y ago

This is clever. And as I understand it, one of the tricks WarpStream (https://www.warpstream.com) use to reduce the costs of operating a Kafka cluster.

playingalong2y ago

This trades cost for latency. That's not a big deal for some use cases, but may be a real deal breaker for others.

Havoc2y ago

It’s unfortunate that such shenanigans are even necessary

sebazzz2y ago

Offtopic but related: Has anyone noticed transient AWS routing issues as of late?

On three or four occasions in the last three months I’ve noticed that I got served a completely different SSL certificate than the one for the domain I was visiting, for a domain that often could not be reached publicly - probably pointing to some organization's internal OTA environment. On all occasions the URL I wanted to visit and the DNS of the site I was then actually visiting were located in AWS. Then, less than a minute later, the issue is resolved.

I first thought it must be on my side, my DNS server malfunctioning or something, but the served sites could not be accessed publicly anyway, and I had the issue on two separate networks with two separate clients and separate DNS servers. I’ve had it with polar.co internal environment, bank of ireland (iirc), multiple times with download.mozilla.org and on a few other occasions.

I contacted AWS on Twitter about it, but just got some generic pointless response that I should file an incident - but I’m just some random user, I’m not an AWS client. Somehow I could not make that clear to the AWS support on the other side of Twitter.

emmanueloga_2y ago

For those suggesting VPSs instead of cloud based solutions, how do you deal with high availability? Even for a small business you may need it to stay up at all times. With a VPS this is harder to accomplish.

Do you set up the same infrastructure in two or more VPS instances and then load balance? (say, [1]). Feels like a bit of an ... artisanal solution, compared to using something like AWS ECS.

1: https://www.hetzner.com/cloud/load-balancer

gumballindie2y ago

I reduced them to 0 by not using AWS. This simple trick lets you install and configure dedicated servers that work just fine. Most of your auto scaling needs can be solved using a CDN. But by the time you reach such needs you'd have hired competent engineers to properly architect things - it will be cheaper than using amazon anyway.

rco87862y ago

There's going to be a huge market for consultants to unwind people's cloud stacks and go back to simpler on-prem/colo (or Heroku-like) deployments in the coming years.

salawat2y ago

Or... build your own cloud and transfer data to your heart's content for free (minus power).

DeathArrow2y ago

Can you do the same on Azure?

TruthWillHurt2y ago

True meaning of Trustless Environment.

explain2y ago

Paying for bandwidth is crazy.


andrewstuart2y ago· 28 in thread

Seriously, if you’re at the point that you’re doing sophisticated analysis of cloud costs, consider dropping the cloud.

overstay89302y ago

If you're at the point you're doing sophisticated cloud cost analysis you are doing the cloud right, because that is completely impossible anywhere else.

We spend 100k+ on AWS bandwidth and it's still cheaper than our DIA circuits because we don't have to pay network engineers to manage 3 different AZs.

dzikimarian2y ago

Apparently we're doing the impossible for over 12 years now. Who knew?

Only meaningful problem is that on-prem requires some up front cost&time. That can be mitigated by leasing and other means, but indeed can be an issue for small businesses.

4 more replies

holoduke2y ago

My small business once spent 50k per month on AWS. We brought that down to 800 dollars for a similar setup at Hetzner. I find this a significant number.

3 more replies

snug2y ago

AWS and GCP are giving companies like Apple huge discounts so someone could say something like, "Even Apple iCloud is in AWS and GCP because of how economical it is"

There is too much nuance to say one is better than the other. In some cases using an IaaS is more economical, in other cases it's not.

For Apple, the same is also true[0] to say "Even Apple is running their own datacenters because of how economical it is"

0 - https://dgtlinfra.com/apple-data-center-locations/#:~:text=a....

1 more reply

qvrjuec2y ago

>someone who will not treat their datacenter like a home lab

What does this mean? They steal company resources for themselves, or just configure things incompetently?

2 more replies

tiffanyh2y ago

> on premise have no idea how much the salary costs of someone

I haven’t yet met anyone at a company that heavily uses cloud and doesn’t still have the same number of salaried infra people as on-prem.

echelon2y ago

You act like these problems are especially hard.

Active-active, five nines, fault tolerance. Hard stuff. But managing on-prem is no harder.

This is what we're paid for.

4 more replies

maeln2y ago

> Seriously, if you’re at the point that you’re doing sophisticated analysis of cloud costs, consider dropping the cloud.

This can make a lot of sense depending on your infra. If you have some fluctuation in your needs (storage, compute, etc...), a cloud-based solution can be a great fit.

This whole "cloud bad" is just childish.

ownagefool2y ago

> A lot of org move to cloud based hosting because it enable them to go way further in FinOps / cost control

I think a lot of orgs move to cloud simply because it's popular and gartner told them so.

4 more replies

andrewstuart2y ago

>> This whole "cloud bad" is just childish.

Not childish…. it’s a growing line of thought in the IT community that has bought the cloud sell unquestioningly for 20 years.

2 more replies

calvinmorrison2y ago

The cloud is better and worse than bare metal. It depends on the use case.

AWS is Kafkaesque though

1 more reply

whstl2y ago

There is a massive difference between "[if X happens] consider dropping the cloud" and "cloud bad".

Jean-Papoulos2y ago

>Seriously, if you’re at the point that you’re doing sophisticated analysis of cloud costs, consider dropping the cloud.

The blog post's solution is relatively simple to put in place; if you're already locked in to AWS, dropping it will cost quite a lot, and this might be a great middle ground in some cases.

antonvs2y ago

> Host it yourself.

That's a big reason that the choice of cloud became such a no brainer for companies.

sgarland2y ago

Next you’ll tell me that full-stack is a lie, and devs don’t actually know how to run a DB.

2 more replies

crabbone2y ago

Now, as to why AWS pricing is the way it is... we can only guess, but it is likely meant to promote one service over another.

amluto2y ago

AWS is surely operating at a loss for this particular case (zero S3 charge).

But there is no way that cross-AZ traffic costs AWS anywhere near what they charge for it.

belter2y ago

api2y ago

There are also many bare metal and VPS providers that charge radically less for bandwidth or even charge by size of pipe rather than transfer.

Cloud bandwidth pricing is… well the best analogies are very inappropriate for this site.

cdchn2y ago

Which VPS services don't charge you at all for bandwidth?

dijit2y ago

You'd be hard-pressed to find ones that do (before a reasonable limit).

tilaa.com

vultr.com

hetzner.com

linode.com

2 more replies

champtar2y ago

OVH (except for APAC data centers), I think Scaleway also has bandwidth included.

frankjr2y ago

https://www.netcup.eu/vserver/#root-server-details (with fair use policy)

mciancia2y ago

https://www.scaleway.com/en/

1 more reply

Turing_Machine2y ago

The tipping point for self-hosting would be the point at which paying for full-time, 24/7 on-call sysadmins (salary + benefits) is less than your cloud hosting bill.

That's just not gonna happen for a lot of services.

lijok2y ago

This makes no sense.

So what do I do if I'm at the point where I'm doing sophisticated analysis of on-prem costs? Do I move to the cloud?

barkingcat2y ago

Also, if you really need to transfer hundreds of TBs of data from South Africa to the US, put it on a pallet of disks and send it via bulk shipping.

Then load the data into the datacentre and just pay for the last sync / delta.

esafak2y ago

https://aws.amazon.com/snowcone/

develatio2y ago· 10 in thread

I'll share my trick :)

Using the data provided by the post:

3TB worth of traffic from an EC2 instance would cost $276.48 (us-east-1). 3TB worth of traffic from an S3 bucket would cost $69.

Note: one downside of using Lightsail instances is that both ingress and egress traffic counts as "traffic".
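As a rough check of the two totals quoted above, the per-GB rates they imply can be back-calculated. This is only a sketch: the 1024 GB per TB conversion is an assumption (it is the one that makes the EC2 figure come out to a round rate), and `implied_rate_per_gb` is a hypothetical helper, not anything from AWS.

```python
GB_PER_TB = 1024  # assumption: binary terabytes


def implied_rate_per_gb(total_usd, terabytes, gb_per_tb=GB_PER_TB):
    """Back out the per-GB price implied by a quoted total."""
    return total_usd / (terabytes * gb_per_tb)


# Totals quoted in the comment for 3TB of traffic.
ec2_total = 276.48
s3_total = 69.0

ec2_rate = implied_rate_per_gb(ec2_total, 3)
s3_rate = implied_rate_per_gb(s3_total, 3)
ratio = ec2_total / s3_total

print(f"EC2 implied rate: ${ec2_rate:.4f}/GB")
print(f"S3 implied rate:  ${s3_rate:.4f}/GB")
print(f"EC2 total is {ratio:.2f}x the S3 total")
```

Under that assumption the EC2 figure corresponds to about $0.09 per GB, and the quoted EC2 total is roughly four times the quoted S3 total.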

jonatron2y ago

https://aws.amazon.com/service-terms/

develatio2y ago

I had a suspicion that this was against AWS's terms, but I never bothered to look if that was actually the case. Thank you for the heads up!

1 more reply

Hamuko2y ago

At least AWS is fully aware how premium their normal data transfer is and that one might want to optimise those costs.

1 more reply

mmh00002y ago

Yeah, but "service terms" are just recommendations that should often be ignored.

sangnoir2y ago

> You may not use Amazon Lightsail in a manner intended to avoid incurring data fees from other Services

2 more replies

rfoo2y ago

Here's another one:

You can download 1TB of data for free from AWS each month, as Cloudfront has a free tier [1] with 1TB monthly egress included. Point it to S3 or whatever HTTP server you want and voila.

[1] It used to be 50GB per month for the first 12 months. It was changed to 1TB free forever shortly after Cloudflare posted https://blog.cloudflare.com/aws-egregious-egress

overstay89302y ago

That "shortly" was 2 years. That Cloudflare post had nothing to do with it; Amazon barely considers them a competitor to begin with.

1 more reply

andruby2y ago

Nice!

Nitpick: $5 for 2TB is better than $10 for 3TB.
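The nitpick is easy to verify by normalizing both figures to a price per included TB. A minimal sketch, using only the two price/allowance pairs quoted above; `cost_per_tb` is a hypothetical helper name:

```python
def cost_per_tb(price_usd, included_tb):
    """Price per TB of included transfer for a flat-rate plan."""
    return price_usd / included_tb


small_plan = cost_per_tb(5, 2)   # $5 for 2TB
large_plan = cost_per_tb(10, 3)  # $10 for 3TB

print(f"$5/2TB plan:  ${small_plan:.2f} per TB")
print(f"$10/3TB plan: ${large_plan:.2f} per TB")
```

Put differently, two of the $5 plans cost the same $10 but would include 4TB rather than 3TB, assuming the allowances can simply be added up across instances.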

develatio2y ago

Ooohhh!! It is, indeed!

intelVISA2y ago

Nice trick, but you are playing with fire due to AWS's terms.

jakozaur2y ago· 7 in thread

S3 is a nice trick. More tricks:

1. Ask for discounts if you are a big AWS customer (e.g., spend $1mln+/year). At some point, they were huge for inter-AZ transfers.

2. Put things in one AZ. Running DB in "b" zone and your only server in "a" is even worse than just standardizing on one zone.

3. When using multiple AZs, do load-aware AZ balancing.
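One possible reading of "load-aware AZ balancing" in point 3 is: keep traffic inside the caller's AZ when a local backend has headroom, and only pay for a cross-AZ hop when it does not. The sketch below illustrates that idea only; the `Backend` type, the `pick_backend` function, and the 0.8 load threshold are all hypothetical and not part of any AWS API.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    az: str
    load: float  # 0.0 = idle, 1.0 = saturated


def pick_backend(client_az, backends, max_local_load=0.8):
    """Prefer the least-loaded backend in the caller's AZ.

    Falls back to the least-loaded backend in any AZ only when no
    same-AZ backend is below the load threshold.
    """
    if not backends:
        raise ValueError("no backends available")
    local = [b for b in backends
             if b.az == client_az and b.load < max_local_load]
    candidates = local or backends
    return min(candidates, key=lambda b: b.load)


backends = [
    Backend("app-a1", "a", 0.50),
    Backend("app-b1", "b", 0.10),
]
# Stays in AZ "a" even though app-b1 is less loaded.
print(pick_backend("a", backends).name)
```

The trade-off is that a purely load-based balancer would pick `app-b1` here and incur cross-AZ transfer on every request, while this variant accepts somewhat uneven load in exchange for keeping traffic local.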

throwaway1672y ago

> Running DB in "b" zone and your only server in "a"

There must be use cases for this, but I lack imagination. Cost? But not cost?

jakozaur2y ago

Defaults. Either as code or using ClickOps.

Many companies run servers without considering AZ. Then you can get the "best" of both worlds:

1. Your service is down if either AZ hiccups.

2. You pay network charges and latency cost.

sokoloff2y ago

https://docs.aws.amazon.com/ram/latest/userguide/working-wit...

QuadmasterXLII2y ago

Cost, unlimited cost- but no cost.

pibefision2y ago

4. Activate the S3 Intelligent-Tiering storage class?

danielklnsteinOP2y ago

This is great for saving on S3 storage costs!

endgame2y ago

You've got to be careful of the automation charge with Intelligent Tiering.

https://discourse.nixos.org/t/the-nixos-foundations-call-to-...

ishitatsuyuki2y ago· 5 in thread

GCP patched a similar loophole [1] in 2023 presumably because some of their customers were abusing it. I'd expect AWS to do the same if this becomes widespread enough.

[1]: https://cloud.google.com/storage/pricing-announce#network

rfoo2y ago

cedws2y ago

I know of a way to get data out of GCP for free, although I haven't tried it in practice. Wonder if I could find a buyer for this info ;)

BonoboIO2y ago

I got one for Azure, where it would be nearly free to egress data from Azure to any other cloud provider or the internet.

It works, but I have no use case for it. 100TB of egress to the internet costs about $7,000... I think I could do it for $20-$50.

greyface-2y ago

A guess: tunnel through 169.254.169.254 DNS server?

1 more reply

Cthulhu_2y ago

andersa2y ago· 5 in thread

I don't understand how AWS can keep ripping people off with these absurd data transfer fees, when there is Cloudflare R2 just right over there offering a 100 times better deal.

karlkatzke2y ago

Data has "gravity" -- as in, it holds you down to where your data is, and you have to spend money to move it just like you have to spend money to escape gravity.

perryizgr82y ago

tnolet2y ago

We just built a new feature for our pretty bandwidth-heavy SaaS on R2. Works pretty damn well, with indeed massive savings. We just use the AWS SDK (Node.js) and point it at the R2 endpoint.

akira25012y ago

I trust Cloudflare far less than AWS. Once my data is in AWS, all applications in the same region as the data can use the data without paying anything in transfer costs.

Also, the prices he quotes are label prices. If you are a customer and you pre-purchase your bandwidth under an agreement, it gets _significantly_ less expensive.

fabian2k2y ago

quickthrower22y ago· 5 in thread

This is a loophole. Hitting some loss leader at AWS, but if everyone only buys the $1 hotdog and nothing else then the $1 hotdog gets removed.

danielklnsteinOP2y ago

I'm not sure how this could be removed - the fundamentals behind it are basic building blocks of S3.

+ I would guess that S3 is orders of magnitude more profitable for AWS than cross-AZ charges, so I'm not sure they'd consider it a loss-leader.

kevincox2y ago

1 more reply

api2y ago

It’s not a loss leader. Cloud bandwidth pricing is almost pure profit.

martinald2y ago

4 more replies

quickthrower22y ago

Ok “loss” is a relative word here… a loss compared to what they could have got from you.

Somehow AWS has to rip you off, so if there is a non-rip-off gateway to the rip-off, and you can use it to avoid another rip-off, they will close the “loophole”.

lucidguppy2y ago· 2 in thread

This feels like the tech equivalent of tax avoidance.

If too many people do this - AWS will "just close the loophole".

There's not one AWS - there are probably dozens if not hundreds of AWS - each with their own KPIs. One wants to reduce your spend - but not tell you how to really reduce your spend.

If you make something complex enough (AWS) - it will be impossible for customers to optimize in any one factor - as everything is complected together.

karlkatzke2y ago

Another example is using CloudFront. AWS wants you to use CloudFront, so they make CloudFront cheaper than other types of data egress.

lucidguppy2y ago

If they wanted you to behave in specific ways logically - wouldn't their documentation be less ambiguous?

https://www.lastweekinaws.com/blog/aws-cross-az-data-transfe...

jonatron2y ago· 2 in thread

If you're a heavy bandwidth user it's worth looking at Leaseweb, PhoenixNAP, Hetzner, OVH, and others who have ridiculously cheaper bandwidth pricing.

I remember a bizarre situation where the AWS sales guys wouldn't budge on bandwidth pricing even though the company wouldn't be viable at the standard prices.

declan_roberts2y ago

That’s very unusual, I think. Transfer costs seem to be something most people can negotiate.

jonatron2y ago

1 more reply

mlhpdx2y ago· 2 in thread

hipadev232y ago

I’m guessing either you don’t have much data or your infra is already so absurd that, yeah, the transfer costs are irrelevant by comparison.

mlhpdx2y ago

Not using VPCs (no need without instances/containers/RDS) means most of the “absurd” costs go away. It’s cheap by any standard.

1 more reply

xbar2y ago· 2 in thread

After my account started getting bills this month for pennies for which there was no obvious accounting, I slashed my AWS costs by 100%.

I'm back to managing my own systems. So much cheaper and less chance of nonlinear bills.

danielklnsteinOP2y ago

rospaya2y ago

Probably free tier expiration for some small change. With me it was AWS KMS.

nodeshift2y ago· 2 in thread

Someone in the thread said that if you're 'at the point that you’re doing sophisticated analysis of cloud costs, consider dropping the cloud.'

We've built https://nodeshift.com/ with the idea that cloud is affordable by default without any additional optimization, you focus on your app with no concerns on costs or anything else.

akira25012y ago

nodeshift2y ago

lijok2y ago· 1 in thread

huslage2y ago

arpinum2y ago· 1 in thread

declan_roberts2y ago

Sneaky idea! I love it!

vlovich1232y ago· 1 in thread

adrianmonk2y ago

Maybe some sampling mechanism comes along and takes a snapshot once per hour.

---

TheNewsIsHere2y ago· 1 in thread

This may be arguably nitpicking, but the following statement from TFA isn’t exactly the case:

> Moreover, uploading to S3 - in any storage class - is also free!

danielklnsteinOP2y ago

Thanks for the feedback! I don't think it's nitpicking, you're right that it's misleadingly phrased - in fact, the only S3 costs I observed weren't storage at all, but rather the API calls.

I updated the phrasing.

dangoodmanUT2y ago· 1 in thread

playingalong2y ago

Do you mean S3 or all services?

If all services, then things like all or most of a single AZ being borked happen fairly often.

issafram2y ago· 1 in thread

I've been looking for a place to store files for backup. Already keeping a local copy on NAS, but I want another one to be remote. Would you guys recommend S3? Wouldn't be using any other services.

spieden2y ago

esafak2y ago· 1 in thread

How do people economically run multi-region databases for low latency?

declan_roberts2y ago

Data transfer costs are extremely easy to negotiate with Amazon.

boiler_up8002y ago

S3 storage costs are charged per GB-month, so 1 TB * .023 per GB / 730 hrs per month… should be 3 cents if the data was left in the bucket for an hour.[1]

However, it sounds like it was deleted almost right away. In that case the charge might be 0.03 / 60 if the data was around for a minute. Normally I would expect AWS to round this up to $0.01.

The TimedByteStorage value from the cost and usage report would be the ultimate determinant here.

[1] https://handbook.vantage.sh/aws/services/s3-pricing/
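The arithmetic in this comment can be written out as a small prorated-storage calculation. This is a sketch under the comment's own figures ($0.023 per GB-month, 730 hours per month); the 1024 GB per TB conversion and the assumption that billing is prorated linearly by time are mine, and `storage_cost` is a hypothetical helper.

```python
PRICE_PER_GB_MONTH = 0.023  # figure quoted in the comment
HOURS_PER_MONTH = 730       # figure quoted in the comment
GB_PER_TB = 1024            # assumption: binary terabytes


def storage_cost(gb, hours):
    """Storage charge for `gb` held for `hours`, prorated linearly."""
    return gb * PRICE_PER_GB_MONTH * hours / HOURS_PER_MONTH


one_hour = storage_cost(GB_PER_TB, 1)
one_minute = storage_cost(GB_PER_TB, 1 / 60)

print(f"1 TB for one hour:   ${one_hour:.4f}")
print(f"1 TB for one minute: ${one_minute:.6f}")
```

That gives roughly 3 cents for an hour and about a twentieth of a cent for a minute, which is why the line item can be invisible on the bill unless it is rounded up.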

glenngillen2y ago

This is clever. And as I understand it, it's one of the tricks WarpStream (https://www.warpstream.com) uses to reduce the costs of operating a Kafka cluster.

playingalong2y ago

This trades cost for latency, which is not a big deal for some use cases but may be a real deal-breaker for others.

Havoc2y ago

It’s unfortunate that such shenanigans are even necessary

sebazzz2y ago

Offtopic but related: Has anyone noticed transient AWS routing issues as of late?
