Websites and APIs on Render are unavailable due to Cloudflare network errors

(status.render.com)

95 points · bgoldste · 2y ago · 46 comments

anurag · 2y ago · 13 in thread

--- edit ---

Everything is back up. We're waiting for Cloudflare's RCA and will follow up with additional Render context right after.

------------

(Render CEO) While Cloudflare investigates the issue on their end, we're also working on ways to bypass Cloudflare.

Really sorry about this, folks. We'll keep https://status.render.com updated and will post an RCA once things calm down.

Cloudflare have declared an incident at https://www.cloudflarestatus.com/incidents/2xffnv666yd7.

In case you're wondering, we use Cloudflare to keep Render's network up during DDoS attacks. Both Render and our customers are often targeted. We've already started building a product that lets customers bypass Cloudflare altogether, and I expect we'll see more demand for it after today's incident.

jumploops · 2y ago

Thanks for the response anurag.

We really like Render but are running into issues with Cloudflare blocking requests that are incorrectly flagged as malicious (our service passes code blocks over HTTP, similar to Replit).

Not to mention our site was down for way too long this evening…

We’d consider staying if we can bypass Cloudflare altogether. Render has been stable otherwise.

supriyo-biswas · 2y ago

As a quick workaround, try base64 encoding your payloads.
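
A minimal sketch of that workaround, assuming the service sends code snippets inside a JSON body. The field name `code_b64` and both helper functions are hypothetical, and whether encoding avoids any particular firewall rule is not guaranteed.

```python
import base64
import json


def encode_payload(source_code: str) -> str:
    """Wrap a code snippet in a JSON body with the snippet base64-encoded.

    The field name "code_b64" is a made-up example, not a Render or
    Cloudflare convention.
    """
    encoded = base64.b64encode(source_code.encode("utf-8")).decode("ascii")
    return json.dumps({"code_b64": encoded})


def decode_payload(body: str) -> str:
    """Reverse encode_payload on the receiving side."""
    data = json.loads(body)
    return base64.b64decode(data["code_b64"]).decode("utf-8")


snippet = "<script>alert('hi')</script>"
body = encode_payload(snippet)
# The raw markup no longer appears literally in the request body.
assert "<script>" not in body
assert decode_payload(body) == snippet
```

The receiving service has to decode the field before use, so both ends of the API need to agree on the encoding.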

anurag · 2y ago

Send me an email (address in profile) if you'd like alpha access to this feature.

reustle · 2y ago

> We've already started building a product that lets customers bypass Cloudflare altogether

Really happy to hear this. Thank you.

EspressoGPT · 2y ago

Thanks for the transparency here!

TekMol · 2y ago

Why are DDoS attacks a thing?

Who spends resources (money?) on running those? What is the incentive?

belter · 2y ago

They don't spend that many resources. Most participants in DDoS attacks are innocently recruited victims: either victims of their own ignorance or victims of developers' lack of care for secure defaults. In other words, some software product is deployed where it should not be. Then the people and/or AIs who want to run these attacks exploit standard protocol behavior.

"Memcrashed - Major amplification attacks from UDP port 11211" - https://blog.cloudflare.com/memcrashed-major-amplification-a...

bombcar · 2y ago

DDoS attacks basically are still a thing because there's nobody really incentivized to solve it.

The people harmed by them are too small to fix it, and the people big enough make more money selling DDoS mitigation.

From what I understand you can avoid many DDoS just by going IPv6 only, because DDoS mainly depends on unpatched shitmachines from the old days.

maccard · 2y ago

> Who spends resources (money?) on running those?

A Raspberry Pi can generate enough traffic to overload an otherwise unprotected service. It doesn't cost much, if anything, to launch a brute force attack.

There have been posts on here about malicious browser extensions, infected IoT devices, and malware in mobile apps that give someone the means to launch an utterly brutal attack. Imagine if I had a service that could handle 10k rps. Now imagine 600k Android devices from all across the world send one request per second each [0].

[0] https://www.trendmicro.com/vinfo/pl/security/news/mobile-saf...
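
The arithmetic in that example can be made explicit with a small sketch. The function name is hypothetical; the numbers are the ones from the comment above (600k devices, one request per second each, a service that handles 10k rps).

```python
def overload_factor(devices: int, rps_per_device: float, capacity_rps: float) -> float:
    """Return how many times the attack traffic exceeds the service capacity."""
    attack_rps = devices * rps_per_device
    return attack_rps / capacity_rps


# 600,000 devices at 1 request/second against a 10,000 rps service.
factor = overload_factor(600_000, 1, 10_000)
print(factor)  # 60.0
```

Even at one request per second per device, the combined traffic is 60 times what the service was sized for.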

senectus1 · 2y ago

Typically done via hacked bot farms that cost the attacker nothing other than the fun of rolling out standardized scripted attacks on poorly configured servers.

Why they do it... well:

Competition suppression

Vindictive nastiness

Fun

Just because you can (the world is your sandbox)

Other reasons that might not occur to you but are very real for the attacker...

mousetree · 2y ago

We (fintech bank) were DDoSed a few times and were sent ransom emails.

xwdv · 2y ago

HN regular conducts DDoS attacks on small weak websites.

cancan · 2y ago

(Happy Render customer) — Looking forward to the updates on this feature.

015a · 2y ago · 6 in thread

I'm going to share a probably controversial opinion. That opinion is: I can't stand an outage title like "Websites and APIs on Render are unavailable due to Cloudflare network errors". It's passing blame. I run an app or two on Render. I don't pay Cloudflare; I pay Render. Take responsibility for the infrastructure decisions that you make for your customers; don't pass blame to your infrastructure providers.

anurag · 2y ago

We take full responsibility for the infrastructure choices that led to this outage. As the peer comment said, it's helpful to overshare in these situations.

We know developers don't actually care who's at fault and will move off of Render if we're down, period. Even before the incident, we'd started working on a project to eliminate the SPOF with Cloudflare, and now it's only a matter of time before we ship it.

015a · 2y ago

I get that, and the update is much appreciated. I don't mean to insinuate that this was the intention behind why that language was chosen; it's just the sentiment that the language conveys, and that's why I'm not a fan of it.

The stance that I take is: it's a fine line between Oversharing and Passing Blame in outages like this, and while I'm happy that a line like that when shared by Render means it was just oversharing (I love your product!), it's easy to see how a line like that when shared by a less admirable company could be seen as "Nah man, it's not on us, we didn't do anything wrong." A critical difference being: if Cloudflare was the cause, how are we working toward avoiding this cause in the future? Which leads nicely to where pointing at Cloudflare (or any upstream provider) generally feels more agreeable: the retro.

To be clear; I have no intention of leaving Render, even if y'all weren't planning to alleviate this SPOF. I fully grok the difficult engineering required to nuke SPOFs like Cloudflare or AWS; and a bit of downtime here and there is a price I'm fine with paying.

true_religion · 2y ago

If Render's data center was down due to a citywide power outage I would still like to know because it’s the root cause.

I would still blame them for not having backup generators though.

However a failure to plan for emergencies is different from other kinds of failure.

015a · 2y ago

Sure, but there's a time and a place. Outages involve high tensions and fog of war; and you said it yourself, you're already ready to blame them for not having backup generators in this hypothetical example. The midst of an outage is not the time to start casting blame, on people, organizations, processes, providers, whatever. Outages are the time to fix; retros are the time to blame (within productive reason, of course).

pushdownandturn · 2y ago

If you ran it outside of Render, would you be using a CDN service or building your own?

The bigger issue you're alluding to is that of supply-chain reliability in SaaS products: when AWS goes down, multiple other (seemingly unrelated) services go down. But saying it's the downstream service's fault is pointless, because if you were to do it yourself you'd be using the same upstream provider, and be dealing with their outage yourself.

In that example, Slack, as a bigger customer of AWS, would have a much bigger say, and a more direct line to AWS engineers, than you would.

015a · 2y ago

Right, and I think there's an interesting transitive correlation here: As a customer of Render, while Render was down because of Cloudflare, is it appropriate for me to post on our outage page: "Service interruption due to issues at Render"? "Service interruption due to issues at Cloudflare"? What does Cloudflare post on their page? (Well, they may actually post "due to a busted AC unit in our Seattle data center" which, you know, at that point we've hit bedrock so maybe that's valuable, but)

It's turtles all the way down, and in the midst of an outage I totally empathize with the off-the-cuff thinking that oversharing is better than undersharing, but after the fog of war clears you can even retro language like that and come to a different conclusion. What value do my customers, even if they're highly technical, gain by knowing it's Render's fault that MyCoolService was down? Are they going to go open support tickets with Render? I'd bet Render very reasonably wouldn't appreciate that, and they're not going to have a better trunk to their support than I do.

jamil7 · 2y ago · 3 in thread

Kind of worrying that something like Cloudflare is so deeply baked into Render and customers don’t have a choice on whether or not they’re using it.

dbbk · 2y ago

Not really, Render is a managed stack, just like Heroku. If you want full control over the stack you can obviously do that. You pick Render so these choices are managed for you.

NicoJuicy · 2y ago

Picking the technology to build your products upon is not the customer's decision. It's up to the one who creates the product that Render's customers want.

It would be the same as requiring them to use Postgres instead of MS SQL as a backend.

cpursley · 2y ago

I still find the trade-off worth it compared to trying to roll out our own infrastructure (no, k8s isn’t “easy”). And it looks like they are already working on a more robust solution around this particular issue.

Sytten · 2y ago · 2 in thread

Resolved now, but an hour of downtime really shows you why you are paying for bigger cloud providers with an SLA and customer support. Honestly I wish we could have turned Cloudflare off for the duration of the issue vs. having our API down...

Maybe time to consider multiple CDN providers as an abstraction like you consider AWS/GCP as an abstraction.

anurag · 2y ago

> maybe time to consider multiple CDN providers

Yes. While Cloudflare is generally rock solid, we can't let this happen again.

boesboes · 2y ago

That is actually a selling point too. I looked at both Fastly and Cloudflare a while back to replace the budget CDN at my previous job after an outage, but found both had more, and more serious, downtime in the 24 months before that. So I just made a script to quickly switch between providers, but I'd rather not have to deal with it at all :-)
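
A minimal sketch of what the selection step of such a switching script could look like. This is not the commenter's script: the provider names and the `is_healthy` callback are hypothetical, and the actual switch (for example a DNS update) is provider-specific and left out.

```python
from typing import Callable, Optional, Sequence


def pick_provider(
    providers: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy provider in priority order, or None.

    `is_healthy` is injected so the health check (an HTTP probe, a status
    page lookup, and so on) can be swapped out or stubbed in tests.
    """
    for name in providers:
        if is_healthy(name):
            return name
    return None


# Hypothetical example: the primary CDN is failing, the backup is fine.
status = {"primary-cdn": False, "backup-cdn": True}
chosen = pick_provider(["primary-cdn", "backup-cdn"], lambda name: status[name])
print(chosen)  # backup-cdn
```

Keeping the priority list and the health check separate means the same function covers both manual switching and an automated failover loop.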

shash7 · 2y ago · 2 in thread

It seems to be an error on Cloudflare's site. Render seems to be using Cloudflare in some integral capacity which has turned it to toast.

As a Render customer, it's affecting us too. Hope Cloudflare fixes this ASAP.

reustle · 2y ago

As a Render customer, I wish I had the option to not use Cloudflare in any way.

eyeownyde · 2y ago

Why is that? Are these issues common?

anaganisk · 2y ago · 2 in thread

Unrelated, but the choice of background (white) and choice of color (white) for the title in the "hero" section of the website is poor on mobile. I assume the site UI wasn't tested on mobile? https://i.imgur.com/M1n6SYv.jpg

EspressoGPT · 2y ago

What – this text is and always has been black for me.

wongarsu · 2y ago

If the website doesn't set the font color, it might be black in light mode but white when the device/os/browser is in dark mode

gregsadetsky · 2y ago · 2 in thread

render.com down, our sites hosted there down as well... it sucks but it happens.

best of luck render & cloudflare teams!

EDIT: it's back! yay

tebbers · 2y ago

Hmmm, render.com is loading fine from London, UK for me.

gregsadetsky · 2y ago

it just came back online now - they were down when I checked

psnehanshu · 2y ago

Cloudflare just published a resolution 7 minutes ago.

nik736 · 2y ago

So Render is single homed to Cloudflare?
