"Difficult" to de-anonymize is not enough. It must be impossible, and the burden of risk must be on Netflix, not the customer. We're sympathetic in this case because the contest is innovative and interesting. Imagine a slightly different story. In this one, the FBI asks Netflix for "anonymized" information, then de-anonymizes it and starts wiretapping people considered "suspicious." I think we would be rightly appalled at the idea of the government monitoring the movies we watch, and we would criticize anyone handing over the information they need to do it without asking or notifying us.
It's the same thing here, and Netflix should ensure opt-in with full disclosure of the potential for de-anonymization for the same reasons.
When I sign up for a service, I don't expect the vendor to publish data of my transactions to the entire world, even if they claim it is anonymized.
http://userweb.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf http://userweb.cs.utexas.edu/~shmat/netflix-faq.html
When you're using private data in any public way, or even with "partners or affiliates," you need to be very careful, watchful and responsive.
One's "political orientation may be revealed by his strong opinions about "Power and Terror: Noam Chomsky in Our Times" and "Fahrenheit 9/11," and his religious views by ratings on "Jesus of Nazareth" and "The Gospel of John." Even though one should not make inferences solely from someone's movie preferences, in many workplaces and social settings opinions about movies with predominantly gay themes such as "Bent" and "Queer as Folk" (both present and rated in [one individual's] Netflix record) would be considered sensitive."
For certain types of agreements, good faith is not enough. Netflix chooses to go into a business where it is privy to private information about its customers. The onus is on Netflix to protect that information.
I would say the same thing had hackers cracked their security and made off with the data. Good-faith efforts that fail to secure the data are not enough; they must succeed in protecting the privacy of their customers.
If they know that the information can be de-anonymized using publicly available information, have they really made a good-faith effort?
See http://arstechnica.com/tech-policy/news/2009/09/netflix-priz... and its links.
This post, which is linked from the Ars article, states that a given birthdate, gender, and zip code can uniquely identify most Americans. So, while the parent should probably specify birthdate instead of age, a name is definitely not needed.
As stated elsewhere, Netflix should just have a clear opt-in for sharing the data and we can continue on.
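To make the birthdate/gender/zip point concrete, here is a rough sketch of how you could measure it on a dataset you hold. The function name `unique_fraction`, the field names, and the toy records are all made up for illustration; nothing here comes from the Netflix data or the linked post.

```python
from collections import Counter

def unique_fraction(records, keys=("birthdate", "gender", "zip")):
    """Share of records whose combination of `keys` appears exactly once."""
    if not records:
        return 0.0
    # Build the quasi-identifier tuple for every record.
    combos = [tuple(r[k] for k in keys) for r in records]
    counts = Counter(combos)
    # A record is "unique" if nobody else shares its combination.
    unique = sum(1 for c in combos if counts[c] == 1)
    return unique / len(records)

# Hypothetical toy data, invented for this example.
people = [
    {"birthdate": "1975-03-02", "gender": "F", "zip": "78701"},
    {"birthdate": "1975-03-02", "gender": "F", "zip": "78701"},
    {"birthdate": "1980-11-17", "gender": "M", "zip": "78701"},
    {"birthdate": "1962-07-30", "gender": "F", "zip": "94110"},
]

print(unique_fraction(people))                    # all three fields
print(unique_fraction(people, keys=("gender",)))  # gender alone
```

No name is involved anywhere: any record that is alone in its combination can be pinned to one person by whoever knows those three facts about them.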
I am a little surprised to see that people who are generally in an uproar over companies not respecting privacy think it is perfectly OK to do it if it is for "science".
This is one of the big reasons why academics cannot release data, BTW, especially in the social sciences: anonymizing data almost inherently means destroying valuable information.
And I agree with one of the comments on that post - why doesn't Netflix have people opt-in to have their data anonymized and used for this purpose?
Edit: replaced bureaucracy with lawsuits
This was a privacy issue, and something legitimate to address.
I find it surprising that there wasn't a middle ground that anonymized the data or left out personally identifying information.
I feel the lack of a middle-ground is probably due to the lawsuit plaintiffs refusing to back down (presumably for financial gain) and thus causing Netflix to hedge their risk.
Selection bias?
People worried that some geek will find out they watched Who's the Boss reruns will miss out.
Likely, Netflix has the power to start its own social network platform based on its existing users' movie preferences. Some users might be proud to say that they rate specific movies highly, and correlate to particular algorithms based on collective opinions.
For instance: 'Click here to add user X's preferences to your algorithm for Y genre'
See this: http://godplaysdice.blogspot.com/2009/12/uniquely-identifyin...
I imagine gender and DoB factor heavily into something like a recommendation engine, and I'm sure zip code would come into play when trying to get those last couple of percent, as is the case with the Netflix Prize.
"This is Neil Hunt, Chief Product Officer for Netflix."
No vague hiding behind an unidentified team or blog. A person has chosen to identify himself with the decision, and the company chose to present it that way.
The general tone is also positive. This is a good way to communicate.
Contrast with Amazon's anonymous whine on their Kindle blog when they gave in to Macmillan. Not signed by a person, not even "the team." And full of blame and "you'll see!"
http://www.amazon.com/tag/kindle/forum/ref=cm_cd_et_md_pl?_e...
Looking at Netflix's blog, they have 3 posts for the entire year.
Looking at Amazon's discussion forums, there is a swath of official announcements, including one that informs customers how to access 2 million free books outside of Amazon.
Sure, having a name in front of the post is nice, I guess, but in terms of consistent and valuable customer communication I think Amazon wins. Also, here's an example of an apology from Bezos himself:
http://www.amazon.com/tag/kindle/forum/ref=cm_cd_ef_tft_tp?_...
This is professional - but it also lacked much backbone.
Lame
Regards
The Tech Community