Medical Equipment Crashes During Heart Procedure Because of Antivirus Scan (opens in new tab)

(news.softpedia.com)

291 pointsakehrer10y ago204 comments

204 comments

86 comments · 25 top-level

Avernar10y ago· 16 in thread

"Merge says the antivirus froze access to crucial data acquired during the heart catheterization. Unable to access real-time data, the app crashed spectacularly.

The company claims that they included proper instructions in their documentation, advising companies to whitelist Merge Hemo's folders in order to prevent crashes from happening, so it seems that the whole incident was nothing more than an oversight on the medical unit's side."

Here's how I read that: The programmers of this piece of software assumed that some I/O operation would never fail and when it does the program shits itself. So instead of hardening their software to withstand loss of telemetry gracefully, which would cost time and money for the company, they just give instructions to disable scans on their folder.

Odds are good that somewhere this scan will happen (and it did). Either IT doesn't read the release notes or goofs the configuration or an antivirus update clears the white list. Might not even be the antivirus that interferes with the telemetry briefly.

But instead of having resilient software it's "the anitvirus software's fault" or "it's IT's fault" when something goes wrong because of their bad management/engineering decision.

sathackr10y ago

Exactly this. As I was reading the article I hoped to find this exact point in the HN comments.

The fault lies in the bad software. It could have been the indexing service, online defrag, automatic updates, or any of the other various background processes windows runs.

If it is critical software, it should be designed in a way to not fail when something non-critical malfunctions, and even the critical pieces should be built with redundancy.

scruple10y ago

I work for a medical devices company and I just want to say: We, specifically a few of us on the engineering staff, bring this sort of shit up constantly. I go hoarse having the same conversations over and over and over again about robustness in the face of failure, resiliency, redundancy, etc... The truth is that we're beholden to a board and an executive management team that, quite simply, doesn't give a fuck about our problems.

I'm not trying to excuse the company in the article or the company that I work for. And I do not work for the company in the article. I just wanted to point out that I do see how this can happen very easily and repeatedly.

4 more replies

brians10y ago

There are lots of faults. The software failed. The process that directed a helpdesk tech to install AV was a failure of some manager. The decision to engineer systems and networks in a way such that AV seemed like a good idea was a failure of an architect.

2 more replies

LeifCarrotson10y ago

I build software that does the exact same thing. We're running automotive tests, and our management/customers are unwilling to invest in solutions that will work in spite of the fact that Windows is not a real-time OS.

We use a National Instruments DAQ card, and need the PC to respond within 50 ms to issue new commands for hours or days. Remarkably, it usually (over hundreds of machines and decades of operation) does. When it doesn't, it's blamed on antivirus or firewall or technicians using the PC for other things while the software runs.

National Instruments provides real-time IO systems, but they cost a lot more than the basic systems. You can write driver-layer code that will run in real-time on Windows, but that takes longer.

Our customers and management, with varying levels of comprehension of the problem, elect to not spend that money. I hate to say it, but if we didn't make this compromise, there are competitors who would.

vardump10y ago

> We use a National Instruments DAQ card, and need the PC to respond within 50 ms to issue new commands for hours or days. Remarkably, it usually (over hundreds of machines and decades of operation) does. When it doesn't, it's blamed on antivirus or firewall or technicians using the PC for other things while the software runs.

It works as long as full code path and data it requires is not paged out. Or some other thread doesn't consume I/O resources, etc.

In other words, it's not guaranteed at all.

Only way to get Windows to react reliably within 50 ms is in a kernel driver, as response to an IRQ. There's considerable jitter even in IRQ, but usually worst case service times are 200-500 microseconds. Depends a lot on other devices and on your IRQ priority. It's worse for passive level drivers (IRQL == 0).

50 ms guaranteed response time requires the code and data is in non-paged pool.

sillysaurus310y ago

Software shouldn't necessarily try to account for errors in that manner. Usually, the most graceful thing to do is to exit cleanly.

For example, if there is a massive amount of data, it has to be stored on disk. It's too large to keep in memory. And if the point of the program is to transform that data in real time, then it has to have access to the disk.

The antivirus basically unplugged the disk. What can it do to recover? There's nothing to be done.

It should be able to survive that situation, of course. When the disk is plugged back in, it should be able to restart without any problems. But I think that's a different kind of resiliency than what you're referring to.

In this case, the only way to recover would be to copy the frozen data to a new area of the hard drive, assuming it retained read access. But such complexities result in brittle implementations, prone to acquiring bugs. What if the disk space runs out? So you check beforehand whether there's enough space. But what if some other program starts consuming disk space in the middle of your copy operation? And so on. It's an endless spiral of design complexity.

The situation in the article seems closer to hardware failure than a design oversight.

Avernar10y ago

> When the disk is plugged back in, it should be able to restart without any problems. But I think that's a different kind of resiliency than what you're referring to.

Yes and no. I was referring to restarting internally when the error condition went away but restarting the app and waiting for telemetry to return can be a valid solution.

Think of your torrent software. If you crank your firewall to block it while it's running it will not crash. If your disk fills up it won't crash. When the network comes back or more drive space if freed it will restart it's internal mechanisms. You wouldn't want it to restart in these conditions. If it runs out of memory however choosing to exit might be the best recovery mechanism.

I think a life critical medical application can at least strive for internal restart and do an external restart if all else failed. The article stated they had to reboot the machine to get it back. Now that's way worse.

> The situation in the article seems closer to hardware failure than a design oversight.

Hardware failure is almost always a permanent condition. This was a "my I/O stopped briefly and would have came back if my code could handle it".

1 more reply

wpietri10y ago

I totally disagree.

Sure, usually the most graceful thing to do is exit and hope a human fixes it. But that's usual because the usual condition is that sudden failure is NBD and a human is right there to screw with it.

That's becoming less common, though. When software was mostly something running on a PC doing some boring office task, reliability didn't matter. But as software is running our airplanes, our cars, our medical devices, and even, as with implanted pacemakers and insulin pumps, our bodies, then reliability gos from NBD to BFD.

We see the way forward with things Chaos Monkey [1] and crash-only software [2] and the sort of design for failure you see in things like Agent supervisor hierarchies [3], where the way to reliability is through designing for failure recovery from the beginning and testing thoroughly to make sure it really happens.

[1] https://github.com/Netflix/SimianArmy/wiki/Chaos-Monkey

[2] https://en.wikipedia.org/wiki/Crash-only_software

[3] http://doc.akka.io/docs/akka/snapshot/scala/fault-tolerance....

1 more reply

stevetrewick10y ago

Software should absolutely try to account for these kinds of errors and routinely does. When was the last time you saw a word processor or spreadsheet bork a machine so hard it needed to be restarted just because an AV scan kicked in? ReadFile and ReadFileEx both hand you a specific error if some part of the file you are trying to read is locked by another process because it's hardly rare. 'Halt and catch fire' is not generally considered a proportionate response. I've no idea where you're getting this 'unplugged the disk' thing from, AV software does not work this way.

ryandrake10y ago

You can't get enough up votes. Crashing because an I/O operation fails? That's sounds like simply a bug in the software. The developer didn't handle an error properly, and QA didn't test the software on an environment with elevated I/O activity. I've done enough code reviews over the years and seen enough ignoring errors from read(), ignoring malloc() returning null, not handling exceptions, etc. Good developers give a shit but many just don't care at all and think crashing or exiting when you're out of disk space is just fine.

mappu10y ago

Better to crash (and restart quickly into a known state) than to enter a rare, untested code path.

brians10y ago

For software critical to human life, test the rare code paths.

1 more reply

Avernar10y ago

Restarting quickly into a known state is not crashing. That is handling the error. How much of the program you restart is the question. Restart just a thread is better than restarting the whole program.

But their app crashed. And hard. It required a machine reboot to restart. While it returned the machine to a known state it wasn't quick.

And in medical software, all code paths need to be tested.

stevetrewick10y ago

File I/O error handlers should not be a rare untested code path.

Joof10y ago

Agreed. This is what happens when software fails catastrophically. People die and we lose confidence in software that could save lives if it worked properly.

dTal10y ago

You'd think it would take longer to write the instructions than it would to just throw a try-catch block in there.

11thEarlOfMar10y ago· 13 in thread

/rant/

I can't tell you how many times we've chased down field problems that ultimately were the result of antivirus scans. It's been so bad, that one of the first questions we now ask when we get a tool-down report is "is there antivirus running and what is the configuration?"

Bringing Windows into the architecture of any type of capital equipment control system is a bane. A scourge. I mean to say, it really is a misappropriation of software. Imagine, "Yeah, Frank only knows VB, so that's what we used for the aircraft's cockpit GUI."

/xrant/

raverbashing10y ago

This

This machines costs hundreds of thousands of dollars.

There should be no excuse for using Windows. None.

I would not be surprised if the "antivirus" thing was some PHB requirement

Amezarak10y ago

Is there a reason to believe that choosing Windows was a bad decision?

The bad decision was installing antivirus software. Otherwise, most any modern OS would be fine. This machine probably shouldn't be connected to a network (if it was), the USB ports should be disabled; data can come off on burned CDs, autorun should be disabled, etc. That's how you deal with IA concerns on a standalone mission-critical system, not by installing antivirus.

3 more replies

jcrawfordor10y ago

I think a lot of people are misreading this article. It appears that the device is standalone and runs some embedded OS, but as a feature logs data to the doctor's existing computer. So it's not so much that they built the tool on windows as it is that they built the tool on the operating system they expected doctor's to be using.

I still think this was a poor decision, but it's a different kind of poor decision

chapium10y ago

To be honest, there is an excuse, although not a very good one. The machine is costly because it is in high demand. R&D and pushing this through regulatory control are costly. This company has great incentives to cut costs to remain competitive.

1 more reply

ArkyBeagle10y ago

Windows ( at least 7 & XP) can be stripped down to something approximating a properly running system. This isn't easy.

Having an anything-critical Windows machine on the open Internet might not be such a wonderful idea.

copperx10y ago

I would expect these kinds of systems to be running a soft realtime OS. Or at the very least a run of the mill OS with no extraneous software running in the background.

Dwolb10y ago

This. How are these devices not running on some sort of hardened OS seen in airplanes and automotive? Medical applications are mission critical (or some variant) and should have same (or better!) certification procedures set up for correctness and security.

3 more replies

beachstartup10y ago

correlation doesn't imply causation.... except when someone selected windows, and then made 50 other moronic decisions also.

half the work done in this industry is just dealing with stupid decisions made by stupid people. i just accept this now.

tluyben210y ago

But when Windows is chosen because it is easier to find coders who will 'remain within the budget' (cheapcheap) and cannot even make sure the virus scanner doesn't run during procedures or at all it goes a bit too far.

2 more replies

dredmorbius10y ago

Only half?

I mean this quite seriously: you're hugely underestimating. Possibly unaware.

rsync10y ago

"Bringing Windows into the architecture of any type of capital equipment control system is a bane. A scourge."

We need to go much further than this, since many people will "solve" this problem by using a different platform than Windows.

In reality, bringing networking of any type into capital equipment control systems or critical infrastructure is the bane ... the scourge.

Whatever convenience or perceived function that networking (including very local networking, like USB) is dramatically outweighed by the additional attack surface.

Go back to sneakernet and check your facebook at home, Mr. Nuclear Plant Worker.

CaptSpify10y ago

The issue with this is: Many of these systems need to send data to other system on the network. You'll have to send the data over some kind of PACS network, or you'll just have to use a USB.

I agree that's a "better" setup than networking, but I still don't think having staff plug random usb devices into your medical equipment is a great idea either.

rwmj10y ago

And if you want PCI-DSS [credit card handling] certification, then you'd better be running AV software, even when it's completely inappropriate.

GunboatDiplomat10y ago· 7 in thread

Why on earth is medical equipment running standard Windows? This is the ideal location for some basic RTOS or even just an embedded Linux. Seems like a huge cost and risk for no gain.

tluyben210y ago

It's cheap to find Windows programmers and even cheaper to find ones that are not hindered by knowledge about software quality and safety. That's not their fault; no-one ever told them something like that exists.

nonbel10y ago

>"hindered by knowledge"

This is a great phrase. "Joe was hindered by knowledge of what a p-value means and so didn't claim he discovered a key to understanding the disease."

riyadparvez10y ago

That was my first thought too. And why there is an anti-virus running? This equipment should not be connected to the internet nor some staff should plug-in a flash drive on the first place.

ars10y ago

That's not true. They have to keep a record of the data for the patient file. So it does have to communicate remotely in some fashion.

1 more reply

devonkim10y ago

I believe that it's strongly related to the dashboard software being targeted towards "familiarity" by the doctors / customers and having old GUI-centric software from back in the mid-90s ported to run on Windows over the decades prettied up back when there was nothing really viable for end-user accessible embedded systems besides even more grotesquely expensive custom software with even less capabilities and far worse SDKs than anything under Windows back in, say, 1994. This isn't that different to me than banking software with a COBOL backend with Java middleware and a PHP frontend that's extremely common for retail banking sector.

Given the sheer amount of overhead in enterprise BS medical hardware has (read: sales people likely get the biggest chunk of the absorbed costs) it wouldn't be a surprise to me that the engineering teams amount to a skeleton crew while 70%+ of the personnel involved are non-engineers that override the professional decisions of the engineers.

technion10y ago

I feel there are two very contradictory views on HN. The first of these, is that anything safety critical needs to land on an RTOS. The second of these, is to avoid C.

Both of these have reasons behind them and appear to make sense. Leaders in the RTOS space appear to be QNX and vxworks. Suitable languages people raise are Rust and Go.

Based on some Google time, neither of these platforms support either of these OS's. Multiple "Introduction to vxworks" documents are all exclusively in C.

In terms of accessibility, safe languages are far easier for someone to get their hands on, test to death, than some of the OS's recommended.

kosmic_k10y ago

I never realized just how lucky I was to have been born when I was. I can't imagine building embed devices which aren't running a very simply super loop or an ARM RTOS.

steven201210y ago· 4 in thread

Antivirus scans are one of those things added on IT checklists to cover their ass whenever something wrong happens.

But it rarely is useful. It only causes problems. We've seen so many issues related to virus scans throughout the years it's crazy.

What's better is to lock down the servers with only minimal access. I haven't used virus scan on my main desktop for over 10 years because I don't click on weird emails and I don't go to sketchy websites ever. Sure there's the risk of malware from ads I suppose, but I'm not that worried.

jconley10y ago

Most of the time IT is just implementing policy from the CIO, which is basing it on the requirements of the company's insurers. Insurance companies require some very annoying things like Anti-virus. It's like having a lock on your office. You do it so the insurance company will pay you if someone comes in and steals your stuff.

AnthonyMouse10y ago

It's more like the requirements cronies put into defense contracts to make sure the contractors make a lot of money.

The reason "security requirements" documents require antivirus is that companies like Symantec make sure they're in the right position to be the ones asked when someone is writing up a security requirements document, so that their answer can be "make sure you install antivirus (and here's the contact info for our volume licensing center)."

jcrawfordor10y ago

Yeah, you don't click on weird emails and don't go to sketchy websites. Try managing IT security for an enterprise of 10,000 employees. A/V will save your ass hundreds of times every single day.

Computer professionals rarely understand the use case for A/V precisely because they are not the use case. In most all applications, A/V serves first as a safeguard against stupid user behavior, and only second as a safeguard against more advanced penetration (and in the latter case, one with only rare success). I'd bet that the #1 way enterprises are getting breached is still malicious email attachments, that's certainly true in my experience.

qb4510y ago

> I haven't used virus scan on my main desktop for over 10 years because I don't click on weird emails and I don't go to sketchy websites ever.

Haha, I used to be like that as a teen in Windows 9x era until one day I ran tcpdump on the router ;)

ezoe10y ago· 4 in thread

This situation is even funnier(and sadly very seriously flawed) in Japan.

Medical equipment require an authorization to use. Any change to the medical equipment requires another authorization or it's prohibited.

By "any change" , it includes Windows Update(it changes the system obviously).

The result: they use anti-malware software to protect(or rather, believed to protect) unpatched Windows.

At least one anti-malware software company(Trend Micro), marketing that their software can protect the medical equipment in such situation.

exhilaration10y ago

But... what about AV/malware definition updates? Doesn't that fall under "any change"?

symtos10y ago

and what about security updates to the snakeoil they sell, eg. https://bugs.chromium.org/p/project-zero/issues/detail?id=69...

Blackthorn10y ago

> Any change to the medical equipment requires another authorization or it's prohibited.

Honestly, this isn't a bad decision. If the device was tested and certified with specific software, a software upgrade is not guaranteed to not cause a problem.

symtos10y ago

using software with known problems in order to avoid potential problems from an upgrade does not seem like a non-bad decision

1 more reply

CaptSpify10y ago· 3 in thread

https://xkcd.com/463/

The whole structure is wrong. I used to work in medical equipment repair. Windows Embedded is running so many devices it's not funny. But it's not just Windows that's the problem.

I put a linux-system on a PACS network to diagnose equipment. It was a headless, and we asked the IT group to block it off from the Internet.

Hospital IT: "Does it have antivirus?"

Me: "..."

SixSigma10y ago

List of FDA medical equipment recalls for 2016

http://www.fda.gov/MedicalDevices/Safety/ListofRecalls/ucm48...

At least three of them are Class 1 - May cause death

And all of those are software related, none run Windows

http://www.fda.gov/MedicalDevices/Safety/ListofRecalls/ucm48...

kosmic_k10y ago

That is astoundingly horrifying, especially the Class 1's which were distributed for over five years.

2 more replies

ptaipale10y ago

Thank you; I hadn't spotted that XKCD quote about teacher and condom, but it will be very appropriate the next time someone asks me about the antivirus in the server software we do (on Linux).

dchichkov10y ago· 3 in thread

Let me surprise you, with the code quality that sometimes is running in what is actually 'life-critical' software.

Back in the nineties, I wrote a nice piece of some 300kb of C code, for DOS/x86. It was a complete software package, controlling medical equipment that was testing speed of blood coagulation. These tests are crucial in the patient post-operation recovery.

This piece of C code had some hardware control code, some statistics, a bit of math, some visualisation, GUI, etc. Normally, you'd imagine a team of 2-3 people, carefully written test cases, dedicated QA person, and a year of time to write something like it. And independend lab, that would certify the thing. Well... in that case, yes, there was independent certification... but...

It was just one developer, and I was 13, when I wrote it ;) During after-school time, in around 4-6 months. And I must say, I still sometimes have chills, when I think of the code quality, and, um, unorthodox solutions of 13-year-old myself. Yes, I've had some years of experience at the time, both writing software and designing hardware, and advice from my parents, who both could write software. But, at the time, I've had zero formal training, aside from reading K&R and PC XT manuals ;). So, you might imagine the code quality ;) Even, no need to imagine, I actually still have it somewhere in the archives :)

Keyframe10y ago

Quite a story! I wrote a program for a dentist office (not my dentist) when I was 13 and got paid! It was a program to track their patients and stuff though, nothing critical. What was peculiar about this was that the dentist asked me if I can write the software in Turbo Pascal. That's the only language he, kind of, understood and he wanted to maintain it later when I write it. I didn't know Pascal, I only knew C at the time, but I accepted the challenge and wrote my first and only program in TP. It was kind of elaborate, especially for a 13 year-old, but also fun (BGI!).

state10y ago

I understand why you probably don't want to put it up, but boy would it be fun to look at the code you're describing.

dchichkov10y ago

I probably will put it up. It's a nice inspirational story for teens out here. Doubt there'd be any repercussions, no one cares about some random code on GitHub. And the equipment is hopefully taken out of service years ago, it was more than 20 years back. I wish I knew how long it had been used, but there've been only about 10-20 units sold, I think.

I vaguely remember adding extra features for a year or so (like adding support for HP laserjet printer). But one of the founders of the company (on the business side) had some health problems, and I guess that had played role in very small number of units sold. The only feedback that I've had, is pretty much that my father took me to a lab once, that had a unit deployed, for a support call. And I've seen some real printouts with patient names, from the unit. The lab assistant seemed to be happy with the device. I remember them showing me some blood plasma and teaching me to count cells, during lab tour ;)

pdkl9510y ago· 3 in thread

Is it going to take more deaths to convince people to learn from the Therac-25[1]? If you aren't designing for safety first, you have no business working on medical devices or anything else that might be a dangerous when it misbehaves.

[1] http://sunnyday.mit.edu/papers/therac.pdf

chestervonwinch10y ago

I am not the parent poster, but may I ask why is this comment being down-voted? I'm not speaking for the parent, but he or she seems to be implying that medical equipment with anti-virus software with automatic updates (used as such) may potentially compromise a patient's safety, and may be indicative of further bad design practices, which could result in, at worst, death. Is this somehow off-topic, or not worthy of discussion?

pdkl9510y ago

That's exactly right. The article mentions that the doctors were fortunate enough to have five minutes during which they could reboot the device. If they were in the middle of some other procedure that had tighter time constraints, a reboot could have easily killed the patient.

Just like the Therac-25, this isn't about a single problem (the antivirus or the race condition in the Therac-25's software). Designing for safety has to happen at all levels of design. Using Windows (or Linux, or any other complex OS) in a medical device shows that the designer wasn't even considering the safety of major parts of their design.

Designing medical devices with an OS that can be infected with malware (and thus need an antivirus) is the same kind of idiocy that puts a car's steering and brakes on the same CAN bus as the music player and emergency radio. It's a sign that the designer needs either more education or a different job before someone is injured or killed.

jschwartzi10y ago

Because it's really disingenuous to say that the medical device industry hasn't learned anything from Therac 25. The concept of two-fault failures is an industry standard that was learned from Therac.

The fact is that in the Medical software industry the best practice is to manage the entire software configuration of the medical device. Failing to do so, and especially failing to adhere to the guidelines of the manufacturer, is negligent at best. We all know that the behavior that led to the hazard is the wrong thing to do and that somebody screwed up.

The only other real insight that can be gained from this incident is that it's very important to have configuration management procedures that are easy to follow, and it's important to verify that they were correctly followed. I can't tell whether they were in this case, but I suspect given the use of Off-the-shelf software that there was some manual sequence of steps required to adhere to the approved configuration. Given that, I would have expected an error of this magnitude, because it's well known that humans make mistakes whenever they are made to follow a sequence of steps. The configuration should have been verified at installation time, at least.

If you're interested in the kinds of things the industry has to consider in the US, take a look at the FDA guidance for the 510k submittal process.

Kristine197510y ago· 3 in thread

Why is there a virus scanner on a PC inside the operating room?

Don't tell me that PC is connected to the internet...

rs999gti10y ago

I was going to ask this as well. Why does this PC need to be connected to the internet? If it doesn't need to phone home while operating as a heart monitor then there is no need to have antivirus or have this PC connected to the internet.

Also, plenty of devices not connected to the internet run Windows: ATM's, Billboard, Monitors, etc.

Dumb IT is to blame for this mistake.

jcrawfordor10y ago

> Also, plenty of devices not connected to the internet run Windows: ATM's, Billboard, Monitors, etc.

I hate to break it to you, but, in practice... these things are all typically connected to the internet.

mirimir10y ago

The need to get updates, I bet.

fencepost10y ago· 2 in thread

I see a bunch of folks talking about whether PCs are connected to the Internet and "why was it running antivirus in the first place?" It's called Defense in Depth.

It Does Not Matter if the device is connected to/able to reach the Internet.

First, it probably can reach the Internet in some way simply by being networked. I don't think I've ever seen a medical office (can't speak about hospitals) where medical diagnostic equipment was on a fully-separate network able only to talk to other network equipment and specified data destinations (PACS servers).

Second, I'm not concerned about unpatched, unprotected machines being infected from the Internet. Odds are they're running a restricted version of Windows, with a custom shell and a lot of stuff stripped out. I'm concerned that they're going to be infected by another machine on the network that's gotten infected. With all the past SQL Server security issues a decade or more ago, how many people think those SQL Server boxes could be directly reached from outside the local network?

The conjunction of those two is that even if you firewall all that stuff off, the PACS servers are still on both networks, and are probably running much more interesting and vulnerable stuff than the device controllers.

Sure you can fully wall everything off - it's really easy, just do your X-rays onto film, burn your MRIs and ultrasounds onto CDs, and print your EKGs for later scanning. Oh, and listen to people complain about how out-of-date your systems and procedures are.

There are other factors that come in as well - sure, every device manufacturer could provide fully bespoke diagnostic displays developed from the ground up in artisanal software shops providing full employment for assembly programmers working on embedded systems, along with cohorts of graphic designers creating glorious steampunk-styled interfaces. That's a beautiful dream, keep having it.

For the rest of the world, creating a UI on that custom embedded system running on something from RIM/Blackberry (yeah, they own QNX) is just going to get them crap from people because of A) how clunky it probably looks and B) How could they even consider allowing direct user interaction with the RTOS that was chosen to ensure that the dangerous bits in contact with patients/radiation/irradiated patients were safe?

There's a beautiful world out there somewhere where everything is safe and secure and seamless and updated. The rest of us live in worlds where Joe in Marketing's PC gets infected with something that allows an attacker to start scanning the network for unpatched vulnerabilities on any system, which leads to an out-of-date install of IIS on a legacy server that hasn't been updated because there's no longer a contract with the vendor (or no vendor) but it's around because there's a statutory requirement to keep the data on that system for 7-10 years.

There's a lot of ugliness out there. Antivirus is a way to try to ensure that when (not if) some of it hits you the repercussions are minimized.

tremon10y ago

it probably can reach the Internet in some way simply by being networked

That is simply untrue, you can (and in many cases should) have unroutable subnets. But even if true, that only slightly changes the question: why is operating room equipment networked in the first place? That you've never encountered a proper setup doesn't excuse not having it.

fencepost10y ago

I phrased that badly - it's not that they can reach the Internet, it's that with the exception of true high-security fully-airgapped locations, if the machine is networked then it's almost guaranteed that the Internet (or something on it) can effectively reach out and touch that machine even if it's only via other systems.

I don't work in a hospital environment, haven't for more than a decade and wasn't interacting with clinical systems even then, but my understanding is that a very significant amount of medical equipment was networked even then, and was at least in theory capable of streaming HL7-formatted data to other internal systems for reasons of patient care, billing, or both. How much of that happens in the real world instead of being theoretical is something I can't say, but I'm sure in the 15+ years since I was working with HL7 that hospitals and equipment haven't gotten less networked.

iask10y ago· 1 in thread

There are a couple of things here from my POV, first - I would replace the head of their IT and any senior IT staff - who seem to look for the quickest-then-cheap solutions. Dumb ducks who don't spend the extra time understanding the importance of the infrastructure and the software they install. And also replace the service vendor, if they have one.

I've seen this happen time and again, where companies have some 3rd party service vendors who would install AV software on anything they can get their hand on, even a microwave or coffee machine - just to tell the client "my bill is expensive, but you can feel secure, we installed AV". I despise these folks with a passion.

The problem is not Windows. It's a lack of knowledge and understanding. Simple.

For god's sake - it's 2016 - dump the Anti Virus software. I am gonna make t-shirts this summer with this ;)

technion10y ago

    I would replace the head of their IT and any senior IT staff

It's a very good bet the senior IT team were following orders from somewhere else in the chain here.

callesgg10y ago· 1 in thread

Putting antivirus on equipment at all indicates a much bigger problem.

That the equipment is somehow configured to be susceptible to viruses.

datenwolf10y ago

Well at least in EN62304 the installation of AV on medical devices is recommended. The whole thing reads like as if it was written by people who picked up a few buzzwords and read a few articles in a computer magazine.

saganus10y ago· 1 in thread

Wtf?

"The antivirus was configured to scan for viruses every hour, and the scan started right in the middle of the procedure."

Who configures an antiviurs for an hourly scan on a doctor's computer?

pritambaral10y ago

It wasn't even a doctor's computer, it was apparently an operating-room equipment computer.

YeGoblynQueenne10y ago

>> The antivirus was configured to scan for viruses every hour, and the scan started right in the middle of the procedure.

>> The company claims that they included proper instructions in their documentation, advising companies to whitelist Merge Hemo's folders in order to prevent crashes from happening, so it seems that the whole incident was nothing more than an oversight on the medical unit's side.

So "RTFM"? Not very helpful.

combatentropy10y ago

> they included proper instructions in their documentation, advising companies to whitelist Merge Hemo's folders in order to prevent crashes from happening, so it seems that the whole incident was nothing more than an oversight on the medical unit's side.

And the hospital included full instructions to the software company on how to properly perform a heart transplant, so they were baffled why the programmer just let his teammate die of heart failure.

Come on, this kind of stuff should be a zero-configuration hardware-based black box, with its own buttons, screen, etc. --- not something that needs to be (or even can be) connected to something outside the vendor's total control.

stevetrewick10y ago

From the linked report :

Based upon the available information, the cause for the reported event was due to the customer not following instructions concerning the installation of anti-virus software; therefore, there is no indication that the reported event was related to product malfunction or defect

I beg to differ. I'd consider a momentary loss of file I/O due to lock contention causing a machine to require a reboot a shocking defect in - say - a word processor (which, notably, do not have this problem). That this risk is apparently known and the vendor's sole mitigation is to document a 'Don't do that then' is absolutely 100% an indication of a product defect, even in the absence of an actual occurrence.

kinai10y ago

this reminds me of IT crowds bomb disposal robot: https://www.youtube.com/watch?v=z88b96ECZCE

just perfect

toomanythings210y ago

I don't think they say this device is controlled by Windows but it must be. Why professional software and instruments even consider using Windows is beyond me.

malbs10y ago

We've had issues with the latest versions of kaspersky. A burst of network activity is almost guaranteed to crash a machine.

It took us a while to isolate Kaspersky 10, and it's not even any particular component inside of Kaspersky, but only when all features are enabled. We tried different permutations of features to try and isolate the cause of our crashes, but as soon as you have any one feature disabled in, the crashes stop, Very frustrating because ultimately our clients laid the blame at my feet (new software feature, new release, blah blah blah), and not exactly much you can do in the way of hardening against this particular crash, the app generates a burst of network data, and boom, blue screen/instant reboot.

coldcode10y ago

I worked at a financial company that ran its production Oracle database servers on Windows in the same network as the staff (no firewall) and ran virus checkers on them. Performance was terrible of course.

angersock10y ago

Okay, seriously, I need to say something, because I doubt most of the people commenting in this thread have ever dealt with either health IT, healthcare software, or any of the related nonsense.

There are kinda four flavors of machine setup I ran into while in that field: big server banks for on-site hosting (think huge enterprise VM farms, for data warehousing and record storage and virtual desktop hosting), care provider systems (think like tablets, doctor office computers, nurse workstations, room workstations), cart computers (used for things like running the sonogram or cardiogram equipment, or for other studies), and actual integrated devices (for, say, data collection).

The care provider systems are usually comically locked-down, tablets and phones having the meanest management software they can (no apps, limited connectivity, remote wiping, and so forth). Workstations tend to be centrally managed, have images pushed regularly (ha!), and often use AD and smartcards to handle authentication. One place I've seen took this a step further, and basically just booted users directly into a VM hosted on the server farms mentioned earlier. You can't use USB devices, you have highly-regulated clipboard access, and so forth--this is done to prevent HIPAA breaches. Which is kinda silly given other workarounds, but whatever makes people feel safe and the CIO happy. These workstations run some enterprise version of Windows, probably 7 Pro. Those silly-long extended service agreements you see on Microsoft? Hospitals are some of the people keeping that alive, and they will pay obnoxious amounts of money for the privilege.

The cart computers are typically like the workstations in terms of functionality, but they may have software specific to the device they're talking to. They might not be as locked down (e.g., only acting as thin clients to a remote VM), but they are still running Windows.

The device computers may run some kind of RTOS. In some cases, they'll be running a customized Windows CE installation--which is totally reasonable. There are a lot of good guarantees that that can give a development shop, least of all that they can call up Microsoft instead of StackOverflow and say "Hey, this function does x, it's documented as y, and we're paying you a lot of money, so what the fuck?". Windows Embedded (which is I think the successor, am not sure).

In all of these cases, Windows itself works pretty damned well.

It runs the software everybody needs, it has the enterprise deployment stuff figured out through decades of improvement, and really there is no reason to be scoffing at its choice.

Now, if folks have goofed up and thrown a stupid AV policy on the machine, that's a different question entirely. Health IT is full to the brim of people basically just punching a clock and being unable to get anything done in a reasonable amount of time. Sometimes, they do awesome things, but mainly they are just custodians standing between doctors and really really stupid policy decisions that seemed good at the time.

EDIT: Removed unrelated example at top.

billforsternz10y ago

"A critical medical equipment crashed during a heart procedure due to a timely scan triggered by the antivirus software installed on the PC to which the said device was sending data for logging and monitoring."

That should be untimely. The opposite of timely.

firebones10y ago

For what it is worth, Merge is now part of IBM Watson.

http://www.merge.com/News/Article.aspx?ItemID=660

Welcome to the Health Cloud Powered by Watson.

fla10y ago

How can a medical device be certified for running on 'user hardware' (=uncontrolled environment).

Something is probably missing from the article. IMO, the device in question wasn't critical at all, and a failure could be expected.

mtgx10y ago

Microsoft must be relieved this wasn't yet another Windows 10 upgrade horror story.

j / k navigate · click thread line to collapse

204 comments

86 comments · 25 top-level

Avernar10y ago· 16 in thread

"Merge says the antivirus froze access to crucial data acquired during the heart catheterization. Unable to access real-time data, the app crashed spectacularly.

But instead of having resilient software it's "the anitvirus software's fault" or "it's IT's fault" when something goes wrong because of their bad management/engineering decision.

sathackr10y ago

Exactly this. As I was reading the article I hoped to find this exact point in the HN comments.

The fault lies in the bad software. It could have been the indexing service, online defrag, automatic updates, or any of the other various background processes windows runs.

If it is critical software, it should be designed in a way to not fail when something non-critical malfunctions, and even the critical pieces should be built with redundancy.

scruple10y ago

4 more replies

brians10y ago

2 more replies

LeifCarrotson10y ago

National Instruments provides real-time IO systems, but they cost a lot more than the basic systems. You can write driver-layer code that will run in real-time on Windows, but that takes longer.

vardump10y ago

It works as long as full code path and data it requires is not paged out. Or some other thread doesn't consume I/O resources, etc.

In other words, it's not guaranteed at all.

50 ms guaranteed response time requires the code and data is in non-paged pool.

sillysaurus310y ago

Software shouldn't necessarily try to account for errors in that manner. Usually, the most graceful thing to do is to exit cleanly.

The antivirus basically unplugged the disk. What can it do to recover? There's nothing to be done.

The situation in the article seems closer to hardware failure than a design oversight.

Avernar10y ago

> When the disk is plugged back in, it should be able to restart without any problems. But I think that's a different kind of resiliency than what you're referring to.

Yes and no. I was referring to restarting internally when the error condition went away but restarting the app and waiting for telemetry to return can be a valid solution.

> The situation in the article seems closer to hardware failure than a design oversight.

Hardware failure is almost always a permanent condition. This was a "my I/O stopped briefly and would have came back if my code could handle it".

1 more reply

wpietri10y ago

I totally disagree.

Sure, usually the most graceful thing to do is exit and hope a human fixes it. But that's usual because the usual condition is that sudden failure is NBD and a human is right there to screw with it.

[1] https://github.com/Netflix/SimianArmy/wiki/Chaos-Monkey

[2] https://en.wikipedia.org/wiki/Crash-only_software

[3] http://doc.akka.io/docs/akka/snapshot/scala/fault-tolerance....

1 more reply

stevetrewick10y ago

ryandrake10y ago

mappu10y ago

Better to crash (and restart quickly into a known state) than to enter a rare, untested code path.

brians10y ago

For software critical to human life, test the rare code paths.

1 more reply

Avernar10y ago

But their app crashed. And hard. It required a machine reboot to restart. While it returned the machine to a known state it wasn't quick.

And in medical software, all code paths need to be tested.

stevetrewick10y ago

File I/O error handlers should not be a rare untested code path.

Joof10y ago

Agreed. This is what happens when software fails catastrophically. People die and we lose confidence in software that could save lives if it worked properly.

dTal10y ago

You'd think it would take longer to write the instructions than it would to just throw a try-catch block in there.

11thEarlOfMar10y ago· 13 in thread

/rant/

/xrant/

raverbashing10y ago

This

This machines costs hundreds of thousands of dollars.

There should be no excuse for using Windows. None.

I would not be surprised if the "antivirus" thing was some PHB requirement

Amezarak10y ago

Is there a reason to believe that choosing Windows was a bad decision?

3 more replies

jcrawfordor10y ago

I still think this was a poor decision, but it's a different kind of poor decision

chapium10y ago

1 more reply

ArkyBeagle10y ago

Windows ( at least 7 & XP) can be stripped down to something approximating a properly running system. This isn't easy.

Having an anything-critical Windows machine on the open Internet might not be such a wonderful idea.

copperx10y ago

I would expect these kinds of systems to be running a soft realtime OS. Or at the very least a run of the mill OS with no extraneous software running in the background.

Dwolb10y ago

3 more replies

beachstartup10y ago

correlation doesn't imply causation.... except when someone selected windows, and then made 50 other moronic decisions also.

half the work done in this industry is just dealing with stupid decisions made by stupid people. i just accept this now.

tluyben210y ago

2 more replies

dredmorbius10y ago

Only half?

I mean this quite seriously: you're hugely underestimating. Possibly unaware.

rsync10y ago

"Bringing Windows into the architecture of any type of capital equipment control system is a bane. A scourge."

We need to go much further than this, since many people will "solve" this problem by using a different platform than Windows.

In reality, bringing networking of any type into capital equipment control systems or critical infrastructure is the bane ... the scourge.

Whatever convenience or perceived function that networking (including very local networking, like USB) is dramatically outweighed by the additional attack surface.

Go back to sneakernet and check your facebook at home, Mr. Nuclear Plant Worker.

CaptSpify10y ago

The issue with this is: Many of these systems need to send data to other system on the network. You'll have to send the data over some kind of PACS network, or you'll just have to use a USB.

I agree that's a "better" setup than networking, but I still don't think having staff plug random usb devices into your medical equipment is a great idea either.

rwmj10y ago

And if you want PCI-DSS [credit card handling] certification, then you'd better be running AV software, even when it's completely inappropriate.

GunboatDiplomat10y ago· 7 in thread

Why on earth is medical equipment running standard Windows? This is the ideal location for some basic RTOS or even just an embedded Linux. Seems like a huge cost and risk for no gain.

tluyben210y ago

nonbel10y ago

>"hindered by knowledge"

This is a great phrase. "Joe was hindered by knowledge of what a p-value means and so didn't claim he discovered a key to understanding the disease."

riyadparvez10y ago

That was my first thought too. And why there is an anti-virus running? This equipment should not be connected to the internet nor some staff should plug-in a flash drive on the first place.

ars10y ago

That's not true. They have to keep a record of the data for the patient file. So it does have to communicate remotely in some fashion.

1 more reply

devonkim10y ago

technion10y ago

I feel there are two very contradictory views on HN. The first of these, is that anything safety critical needs to land on an RTOS. The second of these, is to avoid C.

Both of these have reasons behind them and appear to make sense. Leaders in the RTOS space appear to be QNX and vxworks. Suitable languages people raise are Rust and Go.

Based on some Google time, neither of these platforms support either of these OS's. Multiple "Introduction to vxworks" documents are all exclusively in C.

In terms of accessibility, safe languages are far easier for someone to get their hands on, test to death, than some of the OS's recommended.

kosmic_k10y ago

I never realized just how lucky I was to have been born when I was. I can't imagine building embed devices which aren't running a very simply super loop or an ARM RTOS.

steven201210y ago· 4 in thread

Antivirus scans are one of those things added on IT checklists to cover their ass whenever something wrong happens.

But it rarely is useful. It only causes problems. We've seen so many issues related to virus scans throughout the years it's crazy.

jconley10y ago

AnthonyMouse10y ago

It's more like the requirements cronies put into defense contracts to make sure the contractors make a lot of money.

jcrawfordor10y ago

Yeah, you don't click on weird emails and don't go to sketchy websites. Try managing IT security for an enterprise of 10,000 employees. A/V will save your ass hundreds of times every single day.

qb4510y ago

> I haven't used virus scan on my main desktop for over 10 years because I don't click on weird emails and I don't go to sketchy websites ever.

Haha, I used to be like that as a teen in Windows 9x era until one day I ran tcpdump on the router ;)

ezoe10y ago· 4 in thread

This situation is even funnier(and sadly very seriously flawed) in Japan.

Medical equipment require an authorization to use. Any change to the medical equipment requires another authorization or it's prohibited.

By "any change" , it includes Windows Update(it changes the system obviously).

The result: they use anti-malware software to protect(or rather, believed to protect) unpatched Windows.

At least one anti-malware software company(Trend Micro), marketing that their software can protect the medical equipment in such situation.

exhilaration10y ago

But... what about AV/malware definition updates? Doesn't that fall under "any change"?

symtos10y ago

and what about security updates to the snakeoil they sell, eg. https://bugs.chromium.org/p/project-zero/issues/detail?id=69...

Blackthorn10y ago

> Any change to the medical equipment requires another authorization or it's prohibited.

Honestly, this isn't a bad decision. If the device was tested and certified with specific software, a software upgrade is not guaranteed to not cause a problem.

symtos10y ago

using software with known problems in order to avoid potential problems from an upgrade does not seem like a non-bad decision

1 more reply

CaptSpify10y ago· 3 in thread

https://xkcd.com/463/

The whole structure is wrong. I used to work in medical equipment repair. Windows Embedded is running so many devices it's not funny. But it's not just Windows that's the problem.

I put a linux-system on a PACS network to diagnose equipment. It was a headless, and we asked the IT group to block it off from the Internet.

Hospital IT: "Does it have antivirus?"

Me: "..."

SixSigma10y ago

List of FDA medical equipment recalls for 2016

http://www.fda.gov/MedicalDevices/Safety/ListofRecalls/ucm48...

At least three of them are Class 1 - May cause death

And all of those are software related, none run Windows

http://www.fda.gov/MedicalDevices/Safety/ListofRecalls/ucm48...

kosmic_k10y ago

That is astoundingly horrifying, especially the Class 1's which were distributed for over five years.

2 more replies

ptaipale10y ago

Thank you; I hadn't spotted that XKCD quote about teacher and condom, but it will be very appropriate the next time someone asks me about the antivirus in the server software we do (on Linux).

dchichkov10y ago· 3 in thread

Let me surprise you, with the code quality that sometimes is running in what is actually 'life-critical' software.

Keyframe10y ago

state10y ago

I understand why you probably don't want to put it up, but boy would it be fun to look at the code you're describing.

dchichkov10y ago

pdkl9510y ago· 3 in thread

[1] http://sunnyday.mit.edu/papers/therac.pdf

chestervonwinch10y ago

pdkl9510y ago

jschwartzi10y ago

If you're interested in the kinds of things the industry has to consider in the US, take a look at the FDA guidance for the 510k submittal process.

Kristine197510y ago· 3 in thread

Why is there a virus scanner on a PC inside the operating room?

Don't tell me that PC is connected to the internet...

rs999gti10y ago

Also, plenty of devices not connected to the internet run Windows: ATM's, Billboard, Monitors, etc.

Dumb IT is to blame for this mistake.

jcrawfordor10y ago

> Also, plenty of devices not connected to the internet run Windows: ATM's, Billboard, Monitors, etc.

I hate to break it to you, but, in practice... these things are all typically connected to the internet.

mirimir10y ago

The need to get updates, I bet.

fencepost10y ago· 2 in thread

I see a bunch of folks talking about whether PCs are connected to the Internet and "why was it running antivirus in the first place?" It's called Defense in Depth.

It Does Not Matter if the device is connected to/able to reach the Internet.

There's a lot of ugliness out there. Antivirus is a way to try to ensure that when (not if) some of it hits you the repercussions are minimized.

tremon10y ago

it probably can reach the Internet in some way simply by being networked

fencepost10y ago

iask10y ago· 1 in thread

The problem is not Windows. It's a lack of knowledge and understanding. Simple.

For god's sake - it's 2016 - dump the Anti Virus software. I am gonna make t-shirts this summer with this ;)

technion10y ago

    I would replace the head of their IT and any senior IT staff

It's a very good bet the senior IT team were following orders from somewhere else in the chain here.

callesgg10y ago· 1 in thread

Putting antivirus on equipment at all indicates a much bigger problem.

That the equipment is somehow configured to be susceptible to viruses.

datenwolf10y ago

saganus10y ago· 1 in thread

Wtf?

"The antivirus was configured to scan for viruses every hour, and the scan started right in the middle of the procedure."

Who configures an antiviurs for an hourly scan on a doctor's computer?

pritambaral10y ago

It wasn't even a doctor's computer, it was apparently an operating-room equipment computer.

YeGoblynQueenne10y ago

>> The antivirus was configured to scan for viruses every hour, and the scan started right in the middle of the procedure.

So "RTFM"? Not very helpful.

combatentropy10y ago

And the hospital included full instructions to the software company on how to properly perform a heart transplant, so they were baffled why the programmer just let his teammate die of heart failure.

stevetrewick10y ago

From the linked report :

kinai10y ago

this reminds me of IT crowds bomb disposal robot: https://www.youtube.com/watch?v=z88b96ECZCE

just perfect

toomanythings210y ago

I don't think they say this device is controlled by Windows but it must be. Why professional software and instruments even consider using Windows is beyond me.

malbs10y ago

We've had issues with the latest versions of kaspersky. A burst of network activity is almost guaranteed to crash a machine.

coldcode10y ago

angersock10y ago

Okay, seriously, I need to say something, because I doubt most of the people commenting in this thread have ever dealt with either health IT, healthcare software, or any of the related nonsense.

In all of these cases, Windows itself works pretty damned well.

It runs the software everybody needs, it has the enterprise deployment stuff figured out through decades of improvement, and really there is no reason to be scoffing at its choice.

EDIT: Removed unrelated example at top.

billforsternz10y ago

That should be untimely. The opposite of timely.

firebones10y ago

For what it is worth, Merge is now part of IBM Watson.

http://www.merge.com/News/Article.aspx?ItemID=660

Welcome to the Health Cloud Powered by Watson.

fla10y ago

How can a medical device be certified for running on 'user hardware' (=uncontrolled environment).

Something is probably missing from the article. IMO, the device in question wasn't critical at all, and a failure could be expected.

mtgx10y ago

Microsoft must be relieved this wasn't yet another Windows 10 upgrade horror story.

j / k navigate · click thread line to collapse