OVH CEO Octave Klaba speaking about the incident [video] (opens in new tab)

(ovh.com)

194 pointsmxschmitt5y ago140 comments

140 comments

Are there industry options or methods of wiring to allow for a UPS room separate from the actual rooms the racks are stored in?

It's almost tradition to have a rack with UPS's in the bottom and then the rest of the space filled with servers or drive arrays.

We wouldn't ever think of putting a tiny backup generator in the bottom of every rack, so why do we put a battery storage system there? Also, with the advances in battery chemistry technology that improve reliability and density, it's only a matter of time until Lithium chemistry batteries are available and that also increases the risk of fire.

Is there any reason not to move backup power to another room, or even to a separate structure like how they put backup generators on a pad outside of the building?

mikepurvis5y ago

For the extreme opposite of that, Google famously trolled everyone 10 years ago by announcing that every one of their servers had its own in-chassis 12V battery:

https://www.cnet.com/news/google-uncloaks-once-secret-server...

dilyevsky5y ago

It’s true! I wrote software that upgraded firmware on every one of those batteries without frying them up (most of the time). There was a public paper/talk on that few years back

jeffbee5y ago

Yes but Google also later moved to a 48VDC power architecture with the batteries at the bottom of the rack.

See http://apec.dev.itswebs.com/Portals/0/APEC%202017%20Files/Pl... page 6.

tyingq5y ago

That presentation is pretty interesting, including Google inventing their own "Switched Tank" DC-DC converters because the existing ones weren't efficient or reliable enough.

1 more reply

LinuxBender5y ago

And Facebook had small UPS/ATS units at the end of each row. Not sure if they still do that today it was like that when I walked through their datacenter. They did that for the purpose of power efficiency. They lost far less power by having many smaller units.

lclarkmichalek5y ago

What you saw likely wasn't a UPS, but an RPP

> A data center typically spans four rooms, called suites, where racks of servers are arranged in rows. Up to four MSBs provide power to each suite. In turn, each MSB supplies up to four 1.25 MW Switch Boards (SBs). From each SB, power is fed to the 190 KW Reactive Power Panels (RPPs) stationed at the end of each row of racks.

https://research.fb.com/wp-content/uploads/2016/11/dynamo_fa...

The UPS role is taken more by BBUs (Battery Backup Units). https://www.youtube.com/watch?v=KNsposM0sJE has lots of info about them.

(I work for FB, on entirely unrelated things)

1 more reply

WrtCdEvrydy5y ago

That was actually true, not an actual April Fool's day.

tinus_hn5y ago

It sounds rather wasteful to have one device generate mains voltage AC power, only to have another device transform it to lower voltage power and rectify it.

xvf225y ago

I mean chances are they are doing DC distribution.

https://datacenterfrontier.com/google-unveils-48-volt-data-c...

throw0101a5y ago

> Are there industry options or methods of wiring to allow for a UPS room separate from the actual rooms the racks are stored in?

Yes: longer cables.

See Figure 1 in the Schneider-APC white paper, where they have "Electrical Space", "Mechanical Space" (HVAC), and IT Space:

* https://download.schneider-electric.com/files?p_File_Name=VA...

Power is generated hundreds of kilometres from where it is used, so having your UPS room a few dozen metres from your actual DC room isn't a big deal. I-squared-R losses aren't going to be that huge.

Europe uses 400Y/230 for nominal low-voltage distribution (see Table 1 in above), so stringing some 400V extra copper to the PDUs, which then have 230V at the plugs, isn't a big deal.

Reason0775y ago

> "it's only a matter of time until Lithium chemistry batteries are available and that also increases the risk of fire."

Not all Li-ion chemistries are equal. In particular, the increasingly popular LiFePO4 (LFP) technology has much higher energy density, longer lifespan, improved environmental characteristics, and similar if not better safety characteristics compared to lead-acid.

(Besides sharing lead-acid's very low risk of fire, LiFePO4 also contains no corrosive acids which could damage and short equipment if a leak were to occur.)

branko_d5y ago

> lead-acid's very low risk of fire

Isn't lead-acid prone to releasing hydrogen gas?

When jump-starting a car, it is commonly recommended to connect the ground (black) cable to the chassis, not to the battery's black terminal, to avoid a spark igniting the hydrogen and causing an explosion.

tempay5y ago

The lead-acid batteries used in data centres are normally "sealed lead-acid batteries" and release significantly less hydrogen.

See: https://en.wikipedia.org/wiki/VRLA_battery

1 more reply

Nextgrid5y ago

Isn't hydrogen produced only while the battery is charging, or at worst, when discharging? If so, this still makes them much safer as they're otherwise inert while not in operation, compared to lithium-polymer which contains materials flammable at all times.

1 more reply

tyingq5y ago

There are plenty of datacenters with a separate battery room, sure.

sschueller5y ago

So far I have never been in one where this was not the case. Also they always have fire suppression.

hinkley5y ago

The farther your UPS is away from the server the fewer causes of power loss you can prevent.

I've seen people use UPSes to allow them to rearrange wiring. I've seen them fail by relying on the UPSes as well, of course.

If you wire the entire room with 2-3 separate electrical systems all powered off of separate remote UPSes, you can do whatever you want, but it's harder to change your mind or build out incrementally if you do.

doggodaddo785y ago

Batteries burn up, especially NMC Li ion.

Battery rooms were traditionally separate, used lead acid batteries, were surrounded by thicker walls, and equipped with FM200, just like main datacenter floors. They were typically placed near the transfer and PDU switchgear. I wouldn't put anything more flammable than LiFePO4 in a battery room, much less anywhere near a server.

It's people who decide to throw away conventions, common sense, and building codes because "they know better" who get into trouble.

I suspect this datacenter company could be sued into oblivion.

bzzzt5y ago

I've been led around a DC in the Netherlands where they used a separate room for the UPS batteries. Not only because of safety but because the reliability of small rack-mount UPSes really sucks (and you have to match them to the power draw of the servers in the rack or pay for a lot of unused capacity)

All racks and servers were connected to dual power feeds, so even when one of the feeds goes down the servers should run fine.

jve5y ago

We do exactly the same you describe here in Latvia.

walshemj5y ago

Depends for a DC / Telco your normally feeding 48v DC from the UPS which is normally separate.

The lack of fire suppression is also very worrying.

bcrl5y ago

s/UPS/batteries/. DC makes uninterruptible power much more simpler than an AC UPS.

walshemj5y ago

Oh yes I sort of skipped over that part :-)

bayindirh5y ago

Our current DC has generation and UPSes in different rooms, in isolated places from each other. Both are pretty far from the actual DC itself.

kazen445y ago

in most datacenters, the UPS's are stored in the same collum like structure as their generators. A Datacenter is divided into vertical columns, with eacht collum being powered by redundant power feeds, generator(s) and UPS's.

jacquesm5y ago

I love the lack of fluff, lack of political bs and the honesty on display here.

tyingq5y ago

Two interesting pictures from the earlier story...

How close the 3 data centers are in SBG: https://cdn.baxtel.com/data-center/ovh-strasbourg-campus/pho...

How hot that fire was. I'm pretty sure the orange spots are holes melted in the walls that are made from metal shipping containers: https://pbs.twimg.com/media/EwGqV17XMAMF_wa?format=jpg&name=...

tpmx5y ago

I think the key issue here is that there wasn't a functioning fire suppresssion system in place.

Second question: is such a system required for this kind of operation? Maybe?

bayindirh5y ago

If you're running an IT operation this big, you should have both fire detection units and oxygen replacing suppressants.

That should be mandatory. Otherwise, it'd be very hard to contain. Especially when all your servers have RAID controllers with Li-Ion batteries or supercapacitors or other extremely trigger-happy components.

Oh, and cooling systems. You're just kindling the fire with it at the beginning.

brmgb5y ago

> oxygen replacing suppressants

No, they are a major risk to the employees working there.

I would much rather have a data center destroyed by for every twenty years without victims than mandating the user of oxygen replacing fire suppressants.

mattbee5y ago

My old company's DC uses FM200 gas which puts fires out without asphyxiating anyone (I'm assuming the experience would be a respiratory workout for anyone caught in it, but it would have cost £20000 to test).

2 more replies

bayindirh5y ago

I understand the concerns you have, however I think there's a good middle ground. Would you consider the following procedure acceptable?

    - Isolate all rooms with fire-proof doors.
    - Keep fire supression system at manual.
    - When fire breaks try to contain (we have 24h watch).
    - If fails trigger fire supression system. It has 90 second delay and activated per room.
    - Leave premsises, make the calls.

Fire control and supressant control is not inside the system room. Also fire-proof doors seal the room reasonably well, so chemical doesn't move freely. Also there are better chemicals which break down faster and less harmful to everything.

BTW, We use Novec.

2 more replies

Mauricebranagh5y ago

These days those systems are locked out when people are working in there.

Not so much in the old days :-)

I worked at palace (based at Cranfield uni) that had an experimental rig that used freon as a working fluid - we had emergency respirator sets (like the fire service uses) and people trained in them to use to rescue anyone.

sschueller5y ago

I think certain ISO certifications require it.

stingraycharles5y ago

ISO27K doesn’t require it though, it only requires fire detection, not suppression.

raphaelj5y ago

I’m pretty confident they have efficient fire suppression systems.

They are hosting at least 400,000 servers. They for sure have multiple servers taking fire every single day, and yet it’s the first time it ends up in a catastrophic fire.

The fire suppression system either catastrophically failed, or something out of design happened with one of their inverters, as suggested in the video.

paulfurtado5y ago

Do servers really catch on fire nearly daily at this scale?

londons_explore5y ago

Yes. But normally it doesn't get outside the chassis of the machine.

Normally upon disassembly you'll find a few burnt components and a sooty mark the size of your fist.

chrisandchris5y ago

According to various sources (here on HN, also their official site) they only have fire detection and not suppresion [1].

[1] https://www.ovh.com/world/us/about-us/datacenters.xml

tormeh5y ago

The raw video link is https://www.ovh.com/fr/images/sbg/Octave-Klaba-speaking-en-v...

dang5y ago

That seems like a more precise URL so I've changed to that from https://www.ovh.com/fr/images/sbg/index-en.html for now. Thanks!

nickdothutton5y ago

Lucky it wasn’t a UPS explosion. https://h2tools.org/lessons/battery-room-explosion

bscphil5y ago

I realize it's probably just paranoia on my part but I am terrified of UPSs and can't bring myself to have one in my house. Gigantic batteries lying around in a flammable environment seems way too scary. My always-on server is read-only about 99% of the time, so I just put up with the outages when they happen (about 2 a year). If for some reason my OS eventually gets hosed because of this, I'll rebuild it.

Maybe one day if have a basement and some kind of concrete compartment to put the battery in I'll feel a bit better about it... but even then, not much you can do about gas leakage if that's a possibility with more recent UPSs.

Dylan168075y ago

A single lead-acid battery is putting out such a tiny amount of hydrogen. Put it in any room. It's a big fat nothing in a fire too: https://www.youtube.com/watch?v=ndfe0c00gwo Something could short but shorts are possible and dangerous even if no batteries are involved.

Or get a LiFePo battery and put it in a metal tub.

Macha5y ago

That looks like enough fire that spreading to my wooden floors or desks, carpets or skirting boards seems more likely than not?

1 more reply

Mauricebranagh5y ago

How do you feel about a Tesla in your Garage.

AYBABTME5y ago

Or a 40L tank full of gasoline in your garage?

1 more reply

riffic5y ago

There are a couple good bell system practice documents covering lead-acid battery hazards:

* http://etler.com/docs/bsp-archive/157/157-601-701_I19.pdf

* http://etler.com/docs/bsp-archive/157/157-601-101_I7.pdf

lovedswain5y ago

tl;dr UPS maintenance was performed by a vendor the day before the fire. Fire department used a thermal camera to isolate source of fire, it seemed to originate with 2 UPSes, one of which was the recently maintained UPS

mgbmtl5y ago

Other random bits: SBG-2 was an older generation datacenter, had ventilation issues? They have 4 other datacenters who have a similar design. Others, including SBG-3, have newer designs.

They're building 2500 servers per week.

For the offline buildings that are not destroyed, they have to rebuild the electrical distribution and network. It was not clear if they are also moving servers physically.

verytrivial5y ago

I read his description and literal hand waving as saying that for efficiency reasons, in 2011 they were sort of build like how you'd build a camp fire: convective airflow drawn through and out the top.

fanf25y ago

Right, this is the OVH “tower” data center design. Here’s a google maps link to the OVH Strasbourg site https://goo.gl/maps/rQjir8byKNwMDZrm9 with (from south to north) SBG1 (a collection of shipping containers), SBG2 (a figure 8 shape), and SBG3 (a box).

The idea of the tower design is to take ambient air in through the outside walls, through I think just one row of servers, then into a central void where convection pulls the air out.

1 more reply

jeffbee5y ago

This is taking to extremes my maxim that UPSes cause more outages than they prevent.

teh_klev5y ago

It's always the bloody UPS after "maintenance" :)

Back when I worked in hosting we'd get an email from this particular DC's NOC about either "UPS Maintenance" or generator testing. Our hearts would sink because, during one particular eighteen month period, there was a 50/50 chance our suite would go dark afterwards.

makkesk85y ago

If it turns out it's the upses that caught fire, one can't help but wonder if it would be a better idea to house the upses/backup power solutions in an adjacent smaller building outfitted with sprinklers perhaps?

yholio5y ago

The problem with water in the same place with high power equipment is that it instantly turns the room into a death trap for any personnel, now everything is potentially live.

Also, in the first part of a lithium battery fire, dropping water on them is quite explosive. It will eventually quench the fire but on the short run it will make it worse, filling the room with explosive hydrogen and poisonous lithium hydroxide. So when your water sprinklers engage over your UPS, you better be sure there's nobody around: https://www.youtube.com/watch?v=cTJh_bzI0QQ

throw0101a5y ago

> The problem with water in the same place with high power equipment […]

Most high-end UPSes have a relay where you can run an active-high or active-low emergency power off (EPO) signal. The EPO can either be a button that is pressed manually by the staff, automatically via fire suppression system, or both/either.

Schneider-APC white paper (PDF)

* https://download.schneider-electric.com/files?p_File_Name=AS...

The EPO can also cut-off the HVAC so oxygen is no longer fed into the area, and smoke isn't (re-)circulated.

In the US, this is probably covered in NFPA 75, "Standard for the Fire Protection of Information Technology Equipment":

* https://www.nfpa.org/codes-and-standards/all-codes-and-stand...

gmueckl5y ago

Such a switch doesn't make any remaining battery charge in the UPS go away magically. If the UPS housing gets breached (e.g. ingress of impure water or bending/melting from heat), you're back to square one.

1 more reply

walshemj5y ago

You would not be using water here

jacquesm5y ago

You would be using Halon injection in a normal DC, after sufficient warning to employees to gtfo if they hadn't done that already.

fanf25y ago

Not halon any more, but instead an ozone-friendly fire suppression gas such as argonite

2 more replies

ArchOversight5y ago

Halon is not really created anymore. If you have an halon fire suppression system the only way to get recharges is to get it recycled from other systems that are no longer using it.

It's fun when someone accidentally dumps the halon system because they pressed the wrong button, and now halon has to be sourced for replacement.

sschueller5y ago

You would want to use a non-water based fire suppression system.

gautamcgoel5y ago

What incident?

kevinmgranger5y ago

https://news.ycombinator.com/item?id=26407323

aidos5y ago

They had a fire.

11235813215y ago

...sale.

notJim5y ago

Not to be rude, but it's really wild to me that even after all this time during the pandemic, the CEO doesn't have a headset he can use so that the audio is intelligible. It's gotta be one of the highest ROI investments you can possibly make at this point.

tpmx5y ago

It's quite consistent with how the entire operation is run, at least when witnessed from the outside as a customer.

abluecloud5y ago

a functioning fire suppression system would have probably been better but second to that i guess a mic would be a good investment

tpmx5y ago

They had a DIY water sprinkler system in a wood-based structure:

https://lafibre.info/ovh-datacenter/ovh-et-la-protection-inc... (posted in previous threads)

Thaxll5y ago

So putting water on servers don't destroy them?

LinuxBender5y ago

This is typically a FM200 system in the datacenter [1] which is a pressurized gas. There are also often water "dry-pipes" in place that can pressurize if the FM200 fails. Fire suppression codes vary by country.

[1] - https://en.wikipedia.org/wiki/Automatic_fire_suppression

__turbobrew__5y ago

There are gas based fire suppression system such as FM-200 and Novec 1230.

If you aren't using gas based fire suppression systems it is basically an amateur operation. The small data centre at my work (10-ish racks) has a FM-200 fire suppression system, it really isn't that expensive to set up.

subssn215y ago

Most Fir suppression systems used in Electronics are Halon, CO2, or some other O2 replacing substance

sneak5y ago

Headset with mic, a neutral backdrop, a key light, a good camera, and a skincare routine have all paid for themselves 10x since this thing started, in my case.

bigyikes5y ago

This also frustrates me so much with all the newly-live-streamed events this year. So many companies are spending so much money putting together virtual conferences, but can’t be bothered to ship their speakers a decent mic or webcam. Heck, Apple’s $20 headphones would make a huge difference. Instead we get audio that sounds like it was recorded in my shower.

gruez5y ago

The audio was mostly fine, it was just randomly crapping out for a few seconds at a time.

iptrans5y ago

Can somebody add [video] to the title?

5h5y ago

[video - poor audio] maybe

edit: That comment was snide, my heart goes out to the OVH team, the message within the video was good, forthright & honest. I hope it will be well received by their customers - just a shame it's a bit difficult to listen to!

dang5y ago

Sure. Done.

FDSGSG5y ago

Isn't that already implied by the title?

dtx15y ago

Good to get out a response out quick, but this is too quick, the audio is garbage.

justicezyx5y ago

This would be one of inherient difference between smaller vs. giga players in cloud hosting.

AWS/Google/Azure, if this happens, there should only be limited outage to a small fraction of customers. As a matter of fact, Google had such an incident before, and literally no customers (internal and external) noticed.

ev15y ago

This is an apples to oranges comparison. OVH largely sells bare metal; their public cloud wasn't really impacted.

If you are using AWS, Google, or Azure, ran a single (or multiple machines) inside a single AZ with no backups and opted out of snapshots, you would face the exact same situation.

I can definitely say I see people complaining about how everything they have is down on AWS when us-east-1 goes down periodically, while large players that deploy sanely like Netflix fail over to another region seamlessly.

This [only owning a single machine at all] is what most of their customers whinging the most were doing. People that have actual sane production workloads on AWS or GCP are not going to be running 100% of their workload on a single EC2 instance with no backups.

People that are running on OVH are running often things like gameservers etc that monopolise 100% of a physical machine and don't support horizontal scaling. You quite literally cannot force a srcds/hlds server to "load balance" dynamically and fail over on heartbeat.

Often they are kids or students too, and the $30/m for a machine with 32-64GB ram is all they can afford (though this doesn't absolve them of paying $1-2/m more for offsite backups elsewhere)

You can provision more physical machines with the OVH API and have them be up in a different city in a minute or two. You get linespeed bandwidth between OVH DCs. It's up to you to use it.

jonas215y ago

On the other hand, just about every month, there's a story on HN saying why are you wasting your money on AWS when OVH is so much cheaper (for example https://news.ycombinator.com/item?id=24966028).

And well, I guess this is one of the reasons.

ev15y ago

If you choose to run 100% of your workload on a single EC2 VM in us-east-chaos-monkey and put nothing in S3, only local mounted block storage that also disappears when you reboot your on-demand EC2, that is on you.

1 more reply

dvfjsdhgfv5y ago

No, it doesn't work like this. I have several bare-metal (with Heztner, I use OVH for DNS), it's been over 10 years already. I know that if I only rent one machine in one location, I'm asking for trouble. Based on my experience, I would say that every 2-6 years something dies in a server. A disk, a controller, a fan, you name it. It's rare to have servers running for longer than 7 years without any issues, and they're outdated by that time anyway so they need to be migrated to a new machine.

So, as a bare minimum, you rent at least two different machines at two different locations for each project and make offsite backups. It's still way less expensive than AWS.

If I don't need a powerful server and just need to spin some instances for testing or small projects, I use Hetzner Cloud, it's ridiculously cheap.

kazen445y ago

how so? using OVH and their bare metal servers doesn't absolve you from doing your own due dilligence.

As said earlier, their cloud service is unaffected.

justicezyx5y ago

Oh good to know. I don't use OvH, and my limited understanding were from their products page which lists VM style offerings. I had assumed VMs were the major use cases on OvH.

ev15y ago

The pricing on their dedicated servers are cheaper than most 1-2GB VMs on cloud: https://www.ovhcloud.com/en/bare-metal/ - this is their flagship and most expensive brand, the cheaper ones are even less

The funniest tweets demanding their data and saying they'll lose everything are the people running:

https://twitter.com/Sensity_RP/status/1369496048998223873 - GTA5 multiplayer gameserver that begs for donations. Running on one of the cheaper sub-brands of OVH probably (soyoustart, for GAME ddos-protection). $30-40/machine.

https://twitter.com/pdfshift/status/1369550522479480833 - I can't tell if this is a troll

https://twitter.com/KatsanosAlex/status/1369501497348812801 - I can't tell if this is a troll

1 more reply

stefan_5y ago

This is a difference in what you are buying. When you are buying a dedicated server, there isn't exactly a good way to hide that the thing has just gone up in smokes.

When you buy a storage API, sure, failure rates go up, latency increases 100x, but after a few hours its probably back to normal.

Of course, with the increased abstraction, you get more problems. "Availability zones" are useless when most cloud outages are because of configuration or systemic issues that tend to bring the whole thing down, no matter which AZ you are. But apparently it's now considered "good enough" to just go "oh we are down because AWS is down".

tormeh5y ago

If you're in with the big cloud providers you have no choice. Hybrid cloud is economically impossible due to the bandwidth costs.

Yet somehow, at smaller providers and dedicated hosters bandwidth is usually included as a too-cheap-to-meter feature. Gotta love cloud innovation.

gowld5y ago

Bandwidth is cheap until you run out and then it's very expensive.

bauruine5y ago

AWS is the new IBM. Nobody ever got fired for using AWS.

greggyb5y ago

And funny enough, it's now considered crazy to do a hardware startup, even if staffed up with industry vets. The reaction to Oxide is funny to watch, especially.

Saris5y ago

It also depends if you're renting a dedicated server, vs cloud/VPS. AWS/Google/Azure deal with virtualized systems that can be moved around to another server easily.

OVH has a lot of dedicated servers as well though, so if you're using one of those then it can't be moved very easily to avoid downtime.

dilyevsky5y ago

Just to note - only google does live migration (with ~100ms blackout) with others you will take some downtime anyway (assuming the remaining zones even have enough capacity to fit everyone)

londons_explore5y ago

Google never migrates to a different physical location, so it wouldn't actually protect against a whole class of issues (major flooding, war, employee strike, etc)

1 more reply

jeffbee5y ago

I can't even find any press articles about the Google incident.

Nomikos5y ago

Are you googling? ;-)

londons_explore5y ago

At the time that Google data center was not known to be owned by Google. So it will just be "fire in industrial estate"

sergiotapia5y ago

Is it true that OVH has literally all data in that single datacenter?

anyfoo5y ago

Genuine question: How did that question come to be?

Knowing nothing about OVH, I just typed "ovh datacenters" into Google and the first hit was this: https://www.ovh.com/world/us/about-us/datacenters.xml with the first sentence being "27 data centers around the world, including 2 of the largest ones".

ChrisArchitect5y ago

weird '.xml' url.. wonder where that's linked from

list broken down a bit more on this page

https://us.ovhcloud.com/about/company/data-centers

batmansmk5y ago

Nope. 27 in the world, 10 in France, 1 caught fire, 2 others are stopped for inspection.

https://www.ovh.com/world/us/about-us/datacenters.xml

Nomikos5y ago

No, they have 26 left https://www.ovh.com/world/us/about-us/datacenters.xml

sergiotapia5y ago

What a stupid rumor thanks for clarifying

numpad05y ago

It is usually true that data literally on site is on site

FDSGSG5y ago

No. OVH has 27(-1) datacenters.

tyingq5y ago

I'm guessing the other two SBG datacenters took some kind of hit.

The buildings are very close together: https://cdn.baxtel.com/data-center/ovh-strasbourg-campus/pho...

And the fire looked really hot...like melting steel containers hot: https://pbs.twimg.com/media/EwGqV17XMAMF_wa?format=jpg&name=...

ArchOversight5y ago

SBG 1 had some rooms impacted, SBG 3 and 4 were not impacted at all.

However, they need to bring new power online, as well as bring a new network setup/fiber online to get network connectivity.

j / k navigate · click thread line to collapse

140 comments

deftnerd5y ago

Are there industry options or methods of wiring to allow for a UPS room separate from the actual rooms the racks are stored in?

It's almost tradition to have a rack with UPS's in the bottom and then the rest of the space filled with servers or drive arrays.

Is there any reason not to move backup power to another room, or even to a separate structure like how they put backup generators on a pad outside of the building?

mikepurvis5y ago

For the extreme opposite of that, Google famously trolled everyone 10 years ago by announcing that every one of their servers had its own in-chassis 12V battery:

https://www.cnet.com/news/google-uncloaks-once-secret-server...

dilyevsky5y ago

It’s true! I wrote software that upgraded firmware on every one of those batteries without frying them up (most of the time). There was a public paper/talk on that few years back

jeffbee5y ago

Yes but Google also later moved to a 48VDC power architecture with the batteries at the bottom of the rack.

See http://apec.dev.itswebs.com/Portals/0/APEC%202017%20Files/Pl... page 6.

tyingq5y ago

That presentation is pretty interesting, including Google inventing their own "Switched Tank" DC-DC converters because the existing ones weren't efficient or reliable enough.

1 more reply

LinuxBender5y ago

lclarkmichalek5y ago

What you saw likely wasn't a UPS, but an RPP

https://research.fb.com/wp-content/uploads/2016/11/dynamo_fa...

The UPS role is taken more by BBUs (Battery Backup Units). https://www.youtube.com/watch?v=KNsposM0sJE has lots of info about them.

(I work for FB, on entirely unrelated things)

1 more reply

WrtCdEvrydy5y ago

That was actually true, not an actual April Fool's day.

tinus_hn5y ago

It sounds rather wasteful to have one device generate mains voltage AC power, only to have another device transform it to lower voltage power and rectify it.

xvf225y ago

I mean chances are they are doing DC distribution.

https://datacenterfrontier.com/google-unveils-48-volt-data-c...

throw0101a5y ago

> Are there industry options or methods of wiring to allow for a UPS room separate from the actual rooms the racks are stored in?

Yes: longer cables.

See Figure 1 in the Schneider-APC white paper, where they have "Electrical Space", "Mechanical Space" (HVAC), and IT Space:

* https://download.schneider-electric.com/files?p_File_Name=VA...

Power is generated hundreds of kilometres from where it is used, so having your UPS room a few dozen metres from your actual DC room isn't a big deal. I-squared-R losses aren't going to be that huge.

Europe uses 400Y/230 for nominal low-voltage distribution (see Table 1 in above), so stringing some 400V extra copper to the PDUs, which then have 230V at the plugs, isn't a big deal.

Reason0775y ago

> "it's only a matter of time until Lithium chemistry batteries are available and that also increases the risk of fire."

(Besides sharing lead-acid's very low risk of fire, LiFePO4 also contains no corrosive acids which could damage and short equipment if a leak were to occur.)

branko_d5y ago

> lead-acid's very low risk of fire

Isn't lead-acid prone to releasing hydrogen gas?

tempay5y ago

The lead-acid batteries used in data centres are normally "sealed lead-acid batteries" and release significantly less hydrogen.

See: https://en.wikipedia.org/wiki/VRLA_battery

1 more reply

Nextgrid5y ago

1 more reply

tyingq5y ago

There are plenty of datacenters with a separate battery room, sure.

sschueller5y ago

So far I have never been in one where this was not the case. Also they always have fire suppression.

hinkley5y ago

The farther your UPS is away from the server the fewer causes of power loss you can prevent.

I've seen people use UPSes to allow them to rearrange wiring. I've seen them fail by relying on the UPSes as well, of course.

doggodaddo785y ago

Batteries burn up, especially NMC Li ion.

It's people who decide to throw away conventions, common sense, and building codes because "they know better" who get into trouble.

I suspect this datacenter company could be sued into oblivion.

bzzzt5y ago

All racks and servers were connected to dual power feeds, so even when one of the feeds goes down the servers should run fine.

jve5y ago

We do exactly the same you describe here in Latvia.

walshemj5y ago

Depends for a DC / Telco your normally feeding 48v DC from the UPS which is normally separate.

The lack of fire suppression is also very worrying.

bcrl5y ago

s/UPS/batteries/. DC makes uninterruptible power much more simpler than an AC UPS.

walshemj5y ago

Oh yes I sort of skipped over that part :-)

bayindirh5y ago

Our current DC has generation and UPSes in different rooms, in isolated places from each other. Both are pretty far from the actual DC itself.

kazen445y ago

jacquesm5y ago

I love the lack of fluff, lack of political bs and the honesty on display here.

tyingq5y ago

Two interesting pictures from the earlier story...

How close the 3 data centers are in SBG: https://cdn.baxtel.com/data-center/ovh-strasbourg-campus/pho...

How hot that fire was. I'm pretty sure the orange spots are holes melted in the walls that are made from metal shipping containers: https://pbs.twimg.com/media/EwGqV17XMAMF_wa?format=jpg&name=...

tpmx5y ago

I think the key issue here is that there wasn't a functioning fire suppresssion system in place.

Second question: is such a system required for this kind of operation? Maybe?

bayindirh5y ago

If you're running an IT operation this big, you should have both fire detection units and oxygen replacing suppressants.

Oh, and cooling systems. You're just kindling the fire with it at the beginning.

brmgb5y ago

> oxygen replacing suppressants

No, they are a major risk to the employees working there.

I would much rather have a data center destroyed by for every twenty years without victims than mandating the user of oxygen replacing fire suppressants.

mattbee5y ago

2 more replies

bayindirh5y ago

I understand the concerns you have, however I think there's a good middle ground. Would you consider the following procedure acceptable?

    - Isolate all rooms with fire-proof doors.
    - Keep fire supression system at manual.
    - When fire breaks try to contain (we have 24h watch).
    - If fails trigger fire supression system. It has 90 second delay and activated per room.
    - Leave premsises, make the calls.

BTW, We use Novec.

2 more replies

Mauricebranagh5y ago

These days those systems are locked out when people are working in there.

Not so much in the old days :-)

sschueller5y ago

I think certain ISO certifications require it.

stingraycharles5y ago

ISO27K doesn’t require it though, it only requires fire detection, not suppression.

raphaelj5y ago

I’m pretty confident they have efficient fire suppression systems.

They are hosting at least 400,000 servers. They for sure have multiple servers taking fire every single day, and yet it’s the first time it ends up in a catastrophic fire.

The fire suppression system either catastrophically failed, or something out of design happened with one of their inverters, as suggested in the video.

paulfurtado5y ago

Do servers really catch on fire nearly daily at this scale?

londons_explore5y ago

Yes. But normally it doesn't get outside the chassis of the machine.

Normally upon disassembly you'll find a few burnt components and a sooty mark the size of your fist.

chrisandchris5y ago

According to various sources (here on HN, also their official site) they only have fire detection and not suppresion [1].

[1] https://www.ovh.com/world/us/about-us/datacenters.xml

tormeh5y ago

The raw video link is https://www.ovh.com/fr/images/sbg/Octave-Klaba-speaking-en-v...

dang5y ago

That seems like a more precise URL so I've changed to that from https://www.ovh.com/fr/images/sbg/index-en.html for now. Thanks!

nickdothutton5y ago

Lucky it wasn’t a UPS explosion. https://h2tools.org/lessons/battery-room-explosion

bscphil5y ago

Dylan168075y ago

Or get a LiFePo battery and put it in a metal tub.

Macha5y ago

That looks like enough fire that spreading to my wooden floors or desks, carpets or skirting boards seems more likely than not?

1 more reply

Mauricebranagh5y ago

How do you feel about a Tesla in your Garage.

AYBABTME5y ago

Or a 40L tank full of gasoline in your garage?

1 more reply

riffic5y ago

There are a couple good bell system practice documents covering lead-acid battery hazards:

* http://etler.com/docs/bsp-archive/157/157-601-701_I19.pdf

* http://etler.com/docs/bsp-archive/157/157-601-101_I7.pdf

lovedswain5y ago

mgbmtl5y ago

Other random bits: SBG-2 was an older generation datacenter, had ventilation issues? They have 4 other datacenters who have a similar design. Others, including SBG-3, have newer designs.

They're building 2500 servers per week.

For the offline buildings that are not destroyed, they have to rebuild the electrical distribution and network. It was not clear if they are also moving servers physically.

verytrivial5y ago

fanf25y ago

The idea of the tower design is to take ambient air in through the outside walls, through I think just one row of servers, then into a central void where convection pulls the air out.

1 more reply

jeffbee5y ago

This is taking to extremes my maxim that UPSes cause more outages than they prevent.

teh_klev5y ago

It's always the bloody UPS after "maintenance" :)

makkesk85y ago

yholio5y ago

The problem with water in the same place with high power equipment is that it instantly turns the room into a death trap for any personnel, now everything is potentially live.

throw0101a5y ago

> The problem with water in the same place with high power equipment […]

Schneider-APC white paper (PDF)

* https://download.schneider-electric.com/files?p_File_Name=AS...

The EPO can also cut-off the HVAC so oxygen is no longer fed into the area, and smoke isn't (re-)circulated.

In the US, this is probably covered in NFPA 75, "Standard for the Fire Protection of Information Technology Equipment":

* https://www.nfpa.org/codes-and-standards/all-codes-and-stand...

gmueckl5y ago

1 more reply

walshemj5y ago

You would not be using water here

jacquesm5y ago

You would be using Halon injection in a normal DC, after sufficient warning to employees to gtfo if they hadn't done that already.

fanf25y ago

Not halon any more, but instead an ozone-friendly fire suppression gas such as argonite

2 more replies

ArchOversight5y ago

Halon is not really created anymore. If you have an halon fire suppression system the only way to get recharges is to get it recycled from other systems that are no longer using it.

It's fun when someone accidentally dumps the halon system because they pressed the wrong button, and now halon has to be sourced for replacement.

sschueller5y ago

You would want to use a non-water based fire suppression system.

gautamcgoel5y ago

What incident?

kevinmgranger5y ago

https://news.ycombinator.com/item?id=26407323

aidos5y ago

They had a fire.

11235813215y ago

...sale.

notJim5y ago

tpmx5y ago

It's quite consistent with how the entire operation is run, at least when witnessed from the outside as a customer.

abluecloud5y ago

a functioning fire suppression system would have probably been better but second to that i guess a mic would be a good investment

tpmx5y ago

They had a DIY water sprinkler system in a wood-based structure:

https://lafibre.info/ovh-datacenter/ovh-et-la-protection-inc... (posted in previous threads)

Thaxll5y ago

So putting water on servers don't destroy them?

LinuxBender5y ago

[1] - https://en.wikipedia.org/wiki/Automatic_fire_suppression

__turbobrew__5y ago

There are gas based fire suppression system such as FM-200 and Novec 1230.

subssn215y ago

Most Fir suppression systems used in Electronics are Halon, CO2, or some other O2 replacing substance

sneak5y ago

Headset with mic, a neutral backdrop, a key light, a good camera, and a skincare routine have all paid for themselves 10x since this thing started, in my case.

bigyikes5y ago

gruez5y ago

The audio was mostly fine, it was just randomly crapping out for a few seconds at a time.

iptrans5y ago

Can somebody add [video] to the title?

5h5y ago

[video - poor audio] maybe

dang5y ago

Sure. Done.

FDSGSG5y ago

Isn't that already implied by the title?

dtx15y ago

Good to get out a response out quick, but this is too quick, the audio is garbage.

justicezyx5y ago

This would be one of inherient difference between smaller vs. giga players in cloud hosting.

ev15y ago

This is an apples to oranges comparison. OVH largely sells bare metal; their public cloud wasn't really impacted.

If you are using AWS, Google, or Azure, ran a single (or multiple machines) inside a single AZ with no backups and opted out of snapshots, you would face the exact same situation.

Often they are kids or students too, and the $30/m for a machine with 32-64GB ram is all they can afford (though this doesn't absolve them of paying $1-2/m more for offsite backups elsewhere)

You can provision more physical machines with the OVH API and have them be up in a different city in a minute or two. You get linespeed bandwidth between OVH DCs. It's up to you to use it.

jonas215y ago

On the other hand, just about every month, there's a story on HN saying why are you wasting your money on AWS when OVH is so much cheaper (for example https://news.ycombinator.com/item?id=24966028).

And well, I guess this is one of the reasons.

ev15y ago

1 more reply

dvfjsdhgfv5y ago

So, as a bare minimum, you rent at least two different machines at two different locations for each project and make offsite backups. It's still way less expensive than AWS.

If I don't need a powerful server and just need to spin some instances for testing or small projects, I use Hetzner Cloud, it's ridiculously cheap.

kazen445y ago

how so? using OVH and their bare metal servers doesn't absolve you from doing your own due dilligence.

As said earlier, their cloud service is unaffected.

justicezyx5y ago

Oh good to know. I don't use OvH, and my limited understanding were from their products page which lists VM style offerings. I had assumed VMs were the major use cases on OvH.

ev15y ago

The funniest tweets demanding their data and saying they'll lose everything are the people running:

https://twitter.com/pdfshift/status/1369550522479480833 - I can't tell if this is a troll

https://twitter.com/KatsanosAlex/status/1369501497348812801 - I can't tell if this is a troll

1 more reply

stefan_5y ago

This is a difference in what you are buying. When you are buying a dedicated server, there isn't exactly a good way to hide that the thing has just gone up in smokes.

When you buy a storage API, sure, failure rates go up, latency increases 100x, but after a few hours its probably back to normal.

tormeh5y ago

If you're in with the big cloud providers you have no choice. Hybrid cloud is economically impossible due to the bandwidth costs.

Yet somehow, at smaller providers and dedicated hosters bandwidth is usually included as a too-cheap-to-meter feature. Gotta love cloud innovation.

gowld5y ago

Bandwidth is cheap until you run out and then it's very expensive.

bauruine5y ago

AWS is the new IBM. Nobody ever got fired for using AWS.

greggyb5y ago

And funny enough, it's now considered crazy to do a hardware startup, even if staffed up with industry vets. The reaction to Oxide is funny to watch, especially.

Saris5y ago

It also depends if you're renting a dedicated server, vs cloud/VPS. AWS/Google/Azure deal with virtualized systems that can be moved around to another server easily.

OVH has a lot of dedicated servers as well though, so if you're using one of those then it can't be moved very easily to avoid downtime.

dilyevsky5y ago

Just to note - only google does live migration (with ~100ms blackout) with others you will take some downtime anyway (assuming the remaining zones even have enough capacity to fit everyone)

londons_explore5y ago

Google never migrates to a different physical location, so it wouldn't actually protect against a whole class of issues (major flooding, war, employee strike, etc)

1 more reply

jeffbee5y ago

I can't even find any press articles about the Google incident.

Nomikos5y ago

Are you googling? ;-)

londons_explore5y ago

At the time that Google data center was not known to be owned by Google. So it will just be "fire in industrial estate"

sergiotapia5y ago

Is it true that OVH has literally all data in that single datacenter?

anyfoo5y ago

Genuine question: How did that question come to be?

ChrisArchitect5y ago