"A 2000 study found that 87 percent of the U.S. population can be identified using a combination of their gender, birthdate and zip code."
https://en.wikipedia.org/wiki/Data_re-identification
UK postcodes are more specific I believe.
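The quasi-identifier point in that quote can be made concrete with a toy sketch (hypothetical records, standard library only): count how many records are unique on the (gender, birthdate, zip) combination. The size of the smallest group is also the k in k-anonymity.

```python
from collections import Counter

# Toy records: gender, birthdate, zip are "quasi-identifiers" --
# none identifies anyone alone, but together they often single people out.
records = [
    ("F", "1984-03-12", "90210"),
    ("M", "1984-03-12", "90210"),
    ("F", "1991-07-04", "10001"),
    ("F", "1991-07-04", "10001"),
    ("M", "1962-11-30", "73301"),
]

counts = Counter(records)
unique = [r for r in records if counts[r] == 1]
print(f"{len(unique)} of {len(records)} records are unique on these three fields")

# k-anonymity: the dataset is k-anonymous for k = the smallest group size.
k = min(counts.values())
print(f"dataset is {k}-anonymous")
```

With k = 1 anywhere in the dataset, at least one person is fully singled out by these three fields alone, which is exactly the mechanism behind the 87 percent figure.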
"Data that directly identifies patients will be replaced with unique codes in the new data set, but the NHS will hold the keys to unlock the codes “in certain circumstances, and where there is a valid legal reason”, according to its website. "
Does anyone actually know what that means? I wouldn't know, looking at a medical record, how much data would need to be removed to make it anonymous. It likely depends on the record. And there are different answers that could each be right (so which one are they using?).
Edit: as an example, I was in a car accident at 17 and broke my jaw. If you Google my name and the location you'll see the news article. I was the only person treated at the local hospital for a broken jaw that day. You just de-anonymised me.
Or go again: I'm the only person from my town who went to my university in the years I attended. Just look for someone treated in <home area> outside term time and at <uni medical clinic> during term time, 2002-2008.
Giving access to a nation's healthcare data for statistical and ML uses could speed up the development of ML diagnostics enormously.
Once you scrub the source data to remove birth dates, report creation dates and zip codes, it should be sufficiently anonymized not to be traceable back to any individual. We could enable some level of differential privacy on top as well.
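For anyone unfamiliar with the differential-privacy idea mentioned above, here's a minimal sketch of its simplest form, the Laplace mechanism: a counting query has sensitivity 1 (one person can change the count by at most 1), so adding Laplace noise with scale 1/epsilon makes the released count epsilon-differentially private. The cohort query is made up for illustration; the noise is sampled via the Laplace inverse CDF.

```python
import math
import random

def dp_count(true_count: int, epsilon: float) -> float:
    """Release a count under epsilon-differential privacy.

    A counting query has sensitivity 1, so Laplace noise with scale
    1/epsilon suffices (smaller epsilon = more privacy = more noise).
    Noise is drawn via the Laplace inverse CDF from u ~ Uniform(-0.5, 0.5).
    """
    u = random.random() - 0.5
    scale = 1.0 / epsilon
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

random.seed(0)
# Hypothetical query: how many patients in a cohort have condition X?
noisy = dp_count(42, epsilon=0.5)
print(round(noisy, 2))
```

The noisy answers are unbiased, so aggregate statistics stay usable while any single patient's presence in the data is masked; that's the trade this comment is gesturing at.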
ML's two big leaps of the last decade, 2012's CNNs and 2017-18's pre-trained transformers, both came off the back of a leap in data availability (ImageNet for CNNs, and scraping the entire internet for BERT).
Individual hospitals and the startups they bankroll have their own in-house ML teams, but closed data and an unwillingness to disseminate have made the field move at a snail's pace. Additionally, generalizability of any kind won't be achieved until the data is scaled up past small geographic pockets and patient sets; this is especially true in medicine, which has a long-tail problem. Lastly, aggregating data lends a natural anonymity to each person whose data is shared within the dataset.
IMO, disease diagnostics is one of the most natural framings for a problem in ML. It's a purely technical trade where data and decisions have a degree of exactness, and concepts like conditional probability are a natural fit. The only problem is that the pipeline is still largely analog. This means the data collected about a doctor's diagnostic process comes out incomplete, and privacy protections keep it at a scale small enough to make ML difficult.
"Data that directly identifies patients will be replaced with unique codes in the new data set, but the NHS will hold the keys to unlock the codes “in certain circumstances, and where there is a valid legal reason”, according to its website."
It's hard to compare NLP (such as pretrained transformer models) to medical ML, because misdiagnosis has real and potentially fatal consequences. The focus should be on small-scale, explainable ML, not brute-forcing patterns across large populations (which is more useful to insurance companies than to clinicians). FWIW, I'm a massive fan of the potential of CV in diagnosis and in spotting abnormalities early, but I think the proposed opening up of data is absolutely the wrong way to drive innovation in this field.
This link is also relevant for people registered with the NHS: https://www.nhs.uk/your-nhs-data-matters/manage-your-choice/
Whilst legally permissible, making highly sensitive information opt-out rather than opt-in is detestable, and shows the GDPR doesn't go far enough.
A case in point: some dentists in our area are now asking for a comprehensive medical questionnaire (far more than just dental history or medical conditions that might affect appropriate dental treatment) to be completed for any new patient, and then emailed to them. There's not even a pretence of acceptable security and privacy protections. With some other dentists, they ask you to use an online system to send them similar information, and that system is run by a commercial entity not based in this country.
Given how badly the figures right at the top of the health service and the members of the Government who are responsible for it dropped the ball when it came to privacy and COVID apps, it's hard to have much faith in them to properly operate centralised systems that hold substantial information about everyone for very generic-sounding purposes like "planning" and "research".
The fact that this particular opt-out can be completed easily online by adults yet requires a parent or guardian to jump through hoops involving filling in PDF forms in order to opt out of sharing sensitive data about a child says a lot about the level of ethics involved here, and none of it is good.
As a final observation, nothing about opting out of this kind of generic, large-scale system precludes participating in legitimate research conducted with appropriate safeguards and ethical standards. Doctors in a certain field may be working with a research group to investigate a particular condition in their field and its treatment, and can forward an invitation to any of their patients who might have that condition explaining the research and asking if the patient would be willing to participate.

I've seen one of these, and the information provided was very clear about exactly what data would be shared, what it would be used for, who would have access to it and with what safeguards to prevent unauthorised access, and the arrangements for destroying it after the research had been completed -- basically everything you'd hope a responsible organisation doing legitimate medical research would be careful about. So the kind of useful research where someone privacy-conscious might still choose to participate for the greater good isn't necessarily undermined by opting out of generic data-sharing consents.
Citation needed (sorry).
It should be noted that the NHS holds one of the largest, longest-running, most standardised medical record sets in the world. This is because it covers the whole (67m person) country, and because the NHS is so old that its records go back further and are centralised to a degree you don't see in private-healthcare systems, federal systems, or smaller countries. That makes this of interest imho.
The NHS DBs have been used for really good medical research in the past (exactly because of the reasons above). That's fine. But it's different to just sharing the data with anyone...