
0 points · derefr · 8mo ago · 0 comments

You're misunderstanding the offering. (Maybe that's their fault for using intentionally misleading language... but using that language in this way is pretty common nowadays, so this is important to understand.)

For a postpaid service with usage-based billing, there are no separate "free" and "paid" plans (= what you're clearly thinking of when you're saying "tiers" here.)

The "free tier" of these services is a set of per-usage-SKU monthly usage credit bonuses, set up in such a way that if you are using reasonable "just testing" amounts of resources, your bill for the month will be credited down to $0.

And yes, this does mean that even when you're paying for some AWS services, you're still benefitting from the "free tier" for any service whose usage isn't exceeding those free-tier limits. That's why it's a [per-SKU usage] tier, rather than a "plan."

If you're familiar with electricity providers telling you that you're about to hit a "step-up rate" for your electricity usage for the month — that's exactly the same type of usage tier system. Except theirs goes [cheap usage] -> [expensive usage], whereas IaaS providers' tiers go [free usage] -> [costed usage].
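To make the comparison concrete, here is a minimal sketch of graduated usage tiers. The tier sizes and prices are made-up numbers, not any provider's real rates, and `tiered_charge` is a hypothetical helper. The point is only that the electricity case and the IaaS case are the same function with a different first-tier price.

```python
from decimal import Decimal

def tiered_charge(used, tiers):
    """Charge `used` units against (tier_size, unit_price) steps, in order.

    A tier_size of None means "everything beyond the previous tiers".
    """
    total, remaining = Decimal("0"), Decimal(used)
    for size, price in tiers:
        if remaining <= 0:
            break
        in_tier = remaining if size is None else min(remaining, Decimal(size))
        total += in_tier * Decimal(price)
        remaining -= in_tier
    return total

# Hypothetical schedules (illustrative numbers only):
# electricity steps cheap -> expensive, an IaaS SKU steps free -> costed.
ELECTRICITY_KWH = [(1000, "0.10"), (None, "0.15")]
FAAS_REQUESTS = [(1_000_000, "0"), (None, "0.0000002")]

power_bill = tiered_charge(1200, ELECTRICITY_KWH)      # crosses the step-up
testing_bill = tiered_charge(500_000, FAAS_REQUESTS)   # stays in the free tier
busy_bill = tiered_charge(3_000_000, FAAS_REQUESTS)    # only the excess is costed
```

Since the schedule is per SKU, an account can be paying for one SKU while another SKU on the same invoice still nets out to $0.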

> Amazon should halt the application when it exceeds quota.

There is no easy way to do this in a distributed system (which is why IaaS services don't even try; and why their billing dashboards are always these weird detached things that surface billing only in monthly statements and coarse-grained charts, with no visibility into the raw usage numbers.)

There's a lot of inherent complexity in converting "usage" into "billable usage." It involves not just muxing usage credit-spend together, but also classifying spend from each system into a SKU [where the appropriate bucket for the same usage can change over time]; and then a lot of lookups into various control-plane systems to figure out whether any bounded or continuous discounts and credits should be applied to each SKU.

And that means that this conversion process can't happen in the services themselves. It needs to be a separate process pushed out to some specific billing system.

Usually, this means that the services that generate billable usage are just asynchronously pushing out "usage-credit spend events" into something like a log or message queue; and then a billing system is, asynchronously, sucking these up and crunching through them to emit/checkpoint "SKU billing events" against an invoice object tied to a billing account.
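A toy sketch of that shape, under loud assumptions: the "log" is an in-memory deque, the event fields and the `SKU_FOR_SOURCE` mapping are invented for illustration, and a real billing system would also be applying discounts and credits at the checkpoint step. What it shows is that services only append, and the invoice only reflects whatever the billing side has drained so far.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field

@dataclass
class UsageEvent:
    account: str
    source: str   # which service emitted the event
    units: int

# Hypothetical mapping from emitting service to billing SKU.
SKU_FOR_SOURCE = {"faas": "compute-seconds", "blobstore": "storage-gb-hours"}

@dataclass
class BillingSystem:
    # invoices[account][sku] -> units checkpointed so far
    invoices: dict = field(
        default_factory=lambda: defaultdict(lambda: defaultdict(int))
    )

    def drain(self, log, max_events):
        """Consume up to max_events from the log and checkpoint them."""
        for _ in range(min(max_events, len(log))):
            ev = log.popleft()
            self.invoices[ev.account][SKU_FOR_SOURCE[ev.source]] += ev.units

log = deque()
for _ in range(10):
    log.append(UsageEvent("acct-1", "faas", 5))  # services only append

billing = BillingSystem()
billing.drain(log, max_events=6)                 # billing is running behind
known = billing.invoices["acct-1"]["compute-seconds"]
actual = 10 * 5
```

Here `known` is less than `actual` until the remaining events are drained, which is the gap the rest of this comment is about.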

Due to all of the extra steps involved in this pipeline, the cumulative usage that an IaaS knows about for a given billing account (i.e. the figure it could fire a webhook on when one of those billing events hits an MQ topic) might be something like 5 minutes behind the actual incoming usage-credit spend.

Which means that, by the time any "trigger" to shut down your application because it exceeded a "quota" went through, your application would have already spent 5 minutes more of credits.

And again, for a large, heavily-loaded application — the kind these services are designed around — that extra five minutes of usage could correspond to millions of dollars of extra spend.
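The arithmetic is simple enough to simulate. This is a sketch with invented numbers, assuming a constant spend rate and a fixed lag; `overshoot` is a hypothetical helper, not anything a provider exposes.

```python
def overshoot(quota, rate_per_sec, lag_sec):
    """Usage accrued past `quota` before a lagged shutdown trigger can fire.

    The trigger only sees usage as of `lag_sec` seconds ago.
    """
    if rate_per_sec <= 0:
        raise ValueError("rate_per_sec must be positive")
    t = 0
    while True:
        actual = rate_per_sec * t
        visible = rate_per_sec * max(0, t - lag_sec)  # what billing has seen
        if visible >= quota:
            return actual - quota
        t += 1

# 10 units/sec against a 1000-unit quota, with a 5-minute (300 s) lag.
lagged = overshoot(quota=1000, rate_per_sec=10, lag_sec=300)
instant = overshoot(quota=1000, rate_per_sec=10, lag_sec=0)
```

In this model the overshoot comes out to roughly rate times lag, so it scales with how heavily loaded the application is, not with how small the quota was.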

Which is, obviously, unacceptable from a customer perspective. No customer would accept a "quota system" that says you're in a free plan, yet charges you, because you accrued an extra 5 minutes of usage beyond the free plan's limits before the quota could "kick in."

But nor would the IaaS itself just be willing to eat that bill for the actual underlying costs of serving that extra 5 minutes of traffic, because that traffic could very well have an underlying cost of "millions of dollars."

So instead they just say "no, we won't implement a data-plane billable-usage-quota feature; if you want it, you can either implement it yourself [since your L7 app can observe its usage 'live' much better than our infra can] or, more idiomatically to our infra, you can ensure that any development project is configured with appropriate sandboxing + other protections to never get into a situation where any resource could exceed its free-tier-credited usage in the first place."
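For the "implement it yourself" option, a minimal sketch of what an app-level guard could look like. Everything here is hypothetical (`UsageBudget`, `handle_request`, the unit costs), and it assumes a single process; a multi-instance app would need a shared counter, which reintroduces some of the same lag problem on a smaller scale.

```python
import threading

class QuotaExceeded(Exception):
    """Raised when a request would push usage past the self-imposed limit."""

class UsageBudget:
    def __init__(self, limit_units):
        self._limit = limit_units
        self._used = 0
        self._lock = threading.Lock()

    def try_spend(self, units):
        """Reserve `units` synchronously; refuse if it would exceed the limit."""
        with self._lock:
            if self._used + units > self._limit:
                return False
            self._used += units
            return True

def handle_request(budget, cost_units):
    # The check happens in the request path, before any billable work is done.
    if not budget.try_spend(cost_units):
        raise QuotaExceeded
    return "ok"

budget = UsageBudget(limit_units=3)
results = [handle_request(budget, 1) for _ in range(3)]
try:
    handle_request(budget, 1)
    blocked = False
except QuotaExceeded:
    blocked = True
```

The difference from a provider-side quota is that the check sits inline with the work, so there is no detection gap to overspend in.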


ipaddr · 8mo ago

Oracle can do it.

derefr (OP) · 8mo ago

Yes and no. Yes, if we're just specifically talking about the ability to support a free trial that will never bill you (i.e. what the OP was talking about); but no, if we're talking about the more-general ability to set spending limits and never be billed for overage (what this subthread drifted into discussing.)

Oracle Cloud has a 30-day free trial; and that free trial seems to have had some dedicated effort put into a whole divergent billing-infra path for it.

Under Oracle Cloud's free trial, you get a certain amount of spend ($300 in credits); and then, when your trial either expires (30 days) or you run that credit pool down to zero, your account is shut off.

Oracle do eat any marginal costs from your spend taking your credits "below zero" before they shut the account off, because your account was never billing to you anyway; it was billing to Oracle's marketing department as a lead-gen expense.

In other words, unlike Oracle Cloud's steady-state IaaS offering, their free-trial IaaS offering is actually a prepaid (but usage-billed) paradigm — with Oracle being the ones doing the pre-payment.

This works much like an oldschool prepaid phone plan, where you pay in every month to be given a certain number of [expiring/non-"rollover"] minutes/texts/MB of data; and then you get an itemized invoice at the end of the month for how close you came to "using up" each resource that month. And you very well can use up a resource's monthly paid allocation before the end of the month — e.g. "running out of texts" and being unable to send more, rather than those converting into something billed to you. (In a prepaid context, that "converting into being billed" is called "flex" or "pay-as-you-go" [PAYG] billing, and is usually some extra option you would have to enable, if offered at all.)

At scale, prepaid usage-billed systems are also asynchronous; to continue the telecom analogy, most phone-service providers won't re-aggregate your prepaid calling minutes to notice you've run out, until you hang up your current call. Only rarely do they have infra where the billing system can ping the telecom switches' control planes to say "hey, this guy just went over, hang up the call" — and when they do, they only do such checks on a 5-minute/30-minute interval, probably as a scheduled batch query.

But, yes, prepaid systems almost always do just eat any overage generated by this detection gap. This is usually safe, because prepaid systems are almost never elastic to the point that you could accrue nontrivial expenses during that short accounting gap.
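A toy model of that behaviour, assuming the balance is only re-aggregated when a call ends. `PrepaidAccount` and its fields are invented for illustration and don't correspond to any carrier's system.

```python
class PrepaidAccount:
    def __init__(self, minutes):
        self.balance = minutes
        self.absorbed_overage = 0  # usage the provider eats, never billed

    def can_start_call(self):
        return self.balance > 0

    def end_call(self, minutes_used):
        """The balance is only updated here, after the call has finished."""
        shortfall = max(0, minutes_used - self.balance)
        self.balance = max(0, self.balance - minutes_used)
        self.absorbed_overage += shortfall

acct = PrepaidAccount(minutes=10)
started = acct.can_start_call()   # 10 minutes left, so the call connects
acct.end_call(minutes_used=25)    # but nothing interrupts it at minute 10
```

The customer's balance bottoms out at zero and the extra 15 minutes land on the provider, which is tolerable only because one phone line can't run up much usage in a single call.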

When a system is that elastic, a systems architect responds by saying "this should be a postpaid system."

Which means that Oracle Cloud's free trial — insofar as it allows you to make use of truly-elastic resources with per-credit upstream basis costs, like FaaS compute — is probably vulnerable/exploitable. Oracle may sometimes be eating some hefty bills, where people on a free trial have wired their FaaS into a proxy fronting some already-highly-popular service.

This is mostly fine, if you have Oracle's treasury, because you'll still be doing KYC in advance of giving out these trials, so you'll only be letting any given individual do one trial.

But this does put Oracle in the territory of "having to think about people who buy burner identities on the black market [usually for ~$1] to sign up for services using them" + "having to think about people who sign up for their free trial and then sell that free-trial account's credentials on the black market [again, usually for ~$1]."

I haven't checked myself, but I would guess that like any other provider who sees this type of attack (e.g. Hetzner), Oracle Cloud likely has hardened registration flows that reject identities + cards from certain parts of the world; traffic fingerprinting heuristics that immediately shut down free trials if they start up a DDoS attack or the like; etc.

Which is something the other clouds get to skip thinking about entirely, by not having a true "free trial" with a prepaid model, and instead just offering e.g. a one-time $300 sign-up-bonus account credit.

---

But remember, we're only talking about the "free trial" here — something you only get access to for the first 30 days.

Oracle's free tier — the thing you have after the first 30 days — is no different than the one every other IaaS offers. It needs a billing account populated by your credit card; there's infrastructure to allow you to automate control-plane actions in response to billing thresholds being hit, but no offering that will wire anything up for you; etc.

In Oracle Cloud's free tier, you can set budget limits that will prevent new costed resources from being leased while your account is over that limit in a given month (which is certainly nice), but those budget limits don't affect ongoing usage-based billing of a resource. Your FaaS endpoints will continue to accrue vCPU-seconds of billed usage until you, or some automation you wrote, shut them off.
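As a toy model of that distinction (this is not Oracle's API; `Account`, `lease`, and `tick_hour` are made up for illustration), the budget check gates new leases only, while already-running resources keep accruing:

```python
class Account:
    def __init__(self, budget):
        self.budget = budget
        self.spend = 0
        self.running = []  # (name, hourly_cost) pairs

    def lease(self, name, hourly_cost):
        """Control-plane action: blocked once the account is over budget."""
        if self.spend >= self.budget:
            raise PermissionError("over budget: new costed resources blocked")
        self.running.append((name, hourly_cost))

    def tick_hour(self):
        """Ongoing usage: accrues regardless of the budget."""
        self.spend += sum(cost for _, cost in self.running)

acct = Account(budget=10)
acct.lease("fn-endpoint", hourly_cost=6)
acct.tick_hour()
acct.tick_hour()              # spend is now 12, over the budget of 10
try:
    acct.lease("second-fn", hourly_cost=1)
    new_lease_blocked = False
except PermissionError:
    new_lease_blocked = True
acct.tick_hour()              # the existing endpoint keeps billing
```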
