I worked for a company crawling Facebook data by creating viral apps the year the original API came out. By now I am sure this is done by many companies.
Why is any of this news? My understanding is that companies harvesting social networking data via viral apps and then reselling it to perform targeted voter advertising is literally a 10 year old concept. Were any laws broken here? Were there any techniques used here that were novel or done by one political party and not the other? Why are we talking about this one firm and not the many others that surely exist that are trying to do the same thing for <insert political candidate of choice>?
If advertising isn't manipulating and brainwashing, what is it? As far as I can tell, advertising is precisely that.
This has changed a lot in the last few years though, which might explain why the news agencies now have to work through (at least!) 10 years of tech news backlog. I'd definitely like to see coverage of these topics from 'traditional journalists' who then bring this stuff into context and link it with politics, for instance. It's a little sad that we needed the political right and their friends to bring these things to public attention.
> Were any laws broken here?
In Germany, they would have broken the law by doing that. At least here, every website already needed, 10 years ago, not only Terms of Service but also a data privacy section/page. Such a page would be of no use if you could collect data from people before they even visit your website.
This is 100% untrue:
https://bits.blogs.nytimes.com/2008/11/07/how-obamas-interne...
https://www.theatlantic.com/politics/archive/2009/10/exclusi...
These are only two links but let me assure you if you're interested google is full of articles from the 2008 election talking about how vital the internet and data was to the Obama strategy.
I very clearly remember these articles and many others at the time talking about the new "big data" strategies utilized in both 2008 and 2012. To me that is what makes this whole "we scraped data and stole the election for trump" narrative seem extremely suspicious.
If you go back to 2008 election, the media was praising Obama as the first social media president. Remember how well obama used youtube, myspace, facebook, reddit and the burgeoning social space during his election? It's strange how the media is now attacking the social media space they loved so much because trump won the election.
> Why is any of this news?
I think it's because the media and the democrats and a large segment of the elites need something to blame for trump's win. They don't want to blame hillary or themselves for the loss, so they attack social media.
During the 2016 election, Trump was complaining about foreign interference in the elections. And Obama stated there was no foreign interference and that Trump was whining because he was losing in the polls. Back then, the traditional media was backing obama and mocking trump. Now that trump has won, the traditional media is the one pushing the foreign interference narrative.
But I guess it is all conjecture. But ever since trump won the election, there has been a relentless propaganda campaign against social media by the establishment. You can't go a day without seeing a propaganda piece on traditional or social media about how bad social media is.
There is still no evidence that 2008 had any foreign influence. The tactics used in 2008 did not include the sort of misinformation information warfare now being conducted.
In contrast, in Brexit and Trump's wins this article is claiming that there is conclusive foreign interference.
Regarding the establishment, the only propaganda campaigns I can see being waged are against the mainstream news (the ongoing fake news propaganda) and against the tech companies who I suppose are "new money" and not in with the oil and finance czars funding this sort of thing.
The article is really interesting, you should read it. It basically suggests that democracy has largely failed in the age of information warfare. Targeted campaigns by rich elites and foreign governments can now influence votes and psychology on a massive scale. It's no longer about "my side and their side." All of us are on the losing side here.
- The advertisements were not overt ("fake news")
- It raises the barrier of entry to the political process
- Filter bubble effects
Or you can believe a story about "elites" and "biased media".
This should be the main takeaway from this article--that Facebook relies on the honor system for protecting user data.
Breaking a company's TOS isn't a crime in and of itself, and social media data has been used for political targeting for years now. Insinuating that Trump won because of a nefarious brain control operation fueled by data from a "data breach" is irresponsible.
Facebook gives companies access to the data (e.g., "for research") but they're not allowed to sell or provide that data to third parties (which is what these people did).
IMHO (and I am not a lawyer), the law was clearly broken: the Data Protection Act.
I think the clue is in the article:
> ... Russians ... had used the platform to perpetrate “information warfare” against the US
I understand that Hacker News has been accused of turning into reddit since reddit became a thing, but when the topmost comment is from a guy who didn't even bother to read the article linked, there is very little in the way of distinction between the two sites.
If this incident helps protect user privacy further it would be great. However I doubt it would happen at all. Most likely they'd just take this opportunity to aim another round of barrage at Trump instead of talking any substance about the issue itself. The purpose of this reportage is political attack against Trump instead of any concern for privacy in the first place.
Using fb to advertise for Obama, Trump, etc, is ok
However that's not what has been done, but the a) use of shills and fake personas to pump up opinion b) creation of fake "grassroots movements" and "news articles" with a divisive purpose
https://thinkprogress.org/russia-facebook-pages-sophisticate...
https://www.reddit.com/r/RussiaLago/comments/7y6ola/there_ha...
3/14/2008
>The Obama campaign's chief strategist is a master of "Astroturfing" and has a second firm that shapes public opinion for corporations
https://www.bloomberg.com/news/articles/2008-03-14/the-secre...
> However that's not what has been done, but the a) use of shills and fake personas to pump up opinion b) creation of fake "grassroots movements" and "news articles" with a divisive purpose
This has been done since forever. People supporting Obama did it in 2008. Those supporting Ron Paul did it. Bernie Sanders supporters did it and so did Hillary supporters.
See my post: https://www.facebook.com/mstefanow/posts/10156280067194886
Since when NO NEWS is NEWS?
"Why is any of this news? My understanding is that companies harvesting social networking data via viral apps and then reselling it to perform targeted voter advertising is literally a 10 year old concept."
Other team didn't realize such thing as the internet exists?
(I'm outraged that this thing hit the news, as if it wasn't something already known)
When did Obama’s campaign ever do that?
No he didn't. On purpose
https://www.bloomberg.com/news/features/2017-12-21/inside-th...
What's the issue here? Selective information distribution is rooted in society: people are O.K. with some information being known to some people while kept secret from others, because the implications differ.
I.e., I'm fine with being profiled so I can be sold chocolates, but I'm not O.K. with being profiled in order to be manipulated into selecting public officials or making up my mind about controversial topics like God, abortion, guns, etc. I expect to be exposed to these topics in a proper way, i.e. through proper journalism and discussion.
Facebook lets companies bid for ads to show you, based on Facebook’s data about your interests and demographics. If you never engage with the ads there is no information leakage.
It’s the difference between telling a random person “I’ll tell my gay friends about your party” and “Did you know that Bob is dating Steve?”.
So if you believe Facebook sells your information, then you also believe that Facebook is directly lying about it on a help page about that specific topic. Sometimes conspiracy theories are true, so it's not impossible that that's what's happening, but it's not like Facebook is openly selling your information.
But what if you do?
That's not true.
Not only conglomerates but anyone with the money and know-how. Example P&G proxy fight:
https://www.reuters.com/article/us-procter-gamble-trian-inve...
Neubecker said he has seen several ads on his Facebook feed that link to Trian’s “Revitalize P&G” website and to videos of Peltz and former P&G Chief Financial Officer Clayton Daley, who is advising Trian.
The video of Peltz features him sitting in Trian’s Park Avenue, New York City headquarters, discussing P&G’s future, gripping a Trian-labeled coffee mug that reads “Sales up, expenses down.”
In response, P&G has called upon more than a century of product marketing experience with its own “Vote Blue” campaign.
One YouTube video begins with an image of P&G’s blue logo and a banner proclaiming “Every Single Vote Matters!”. A narrator and series of slick images offer step-by-step instructions, ending by asking viewers to vote for the blue proxy card and to throw Trian’s white card in the recycling bin.
Trian won this fight.
Please provide some evidence that anyone with money and know how can harvest tens of millions of accounts from Facebook.
Your example does not address this and seems just like normal Facebook targeting to me.
Even so, there is a significant difference between Coke trying to sell me a flavored soft drink and a firm tweaking my emotions to get me to abstain from voting with false information or to vote against my best interests with false information.
There are definitely people who are susceptible to psychographic warfare and we need to protect them in order to protect our democracy.
This makes "enforcement" basically impossible because you can't have a news outlet without editorial decisions. A better way to go about it would be to try to incentivize media outlets to make a good faith effort at "both sides" journalism, but members of both political sides have been attacking the media for doing just that since the 2016 election season.
Better ban 24 hour news networks then. And any form of political advertising.
1. It is not normal marketing to have tens of millions of Facebook account data on your own private server. This isn't standard practice by any company large or small. Standard practice is to use Facebook's advertising system which does not reveal this data.
2. The wrongdoing wasn't just "breaching the terms of service" it was the transfer of account data from one party to another for the purposes of influencing an election.
3. It is not innuendo.
Only apps have limited access to the data that you agree to share in the app install dialog.
The article you linked does not even mention any of the companies.
You should read this:
https://www.theguardian.com/news/2018/mar/17/data-war-whistl...
Three particularly important points:
The guy who was contracted by CA to steal the data is from Russia.
He had previously undisclosed funding from the Russian government.
CA later tried to do business with a Russian oligarch.
Choice quote:
"There are other dramatic documents in Wylie’s stash, including a pitch made by Cambridge Analytica to Lukoil, Russia’s second biggest oil producer. In an email dated 17 July 2014, about the US presidential primaries, Nix wrote to Wylie: “We have been asked to write a memo to Lukoil (the Russian oil and gas company) to explain to them how our services are going to apply to the petroleum business. Nix said that “they understand behavioural microtargeting in the context of elections” but that they were “failing to make the connection between voters and their consumers”. The work, he said, would be “shared with the CEO of the business”, a former Soviet oil minister and associate of Putin, Vagit Alekperov.
“It didn’t make any sense to me,” says Wylie. “I didn’t understand either the email or the pitch presentation we did. Why would a Russian oil company want to target information on American voters?”"
Sources on this?
Not hard to find information, although it is typically couched as brand reach and advertising spots. The big data sales side of FB and other sites is more sensitive and less overt
https://www.facebook.com/help/494750870625830?helpref=uf_per...
https://www.theguardian.com/technology/2018/mar/17/facebook-...
It was extremely attractive. It could also be deemed illicit, primarily because Kogan did not have permission to collect or use data for commercial purposes. His permission from Facebook to harvest profiles in large quantities was specifically restricted to academic use. And although the company at the time allowed apps to collect friend data, it was only for use in the context of Facebook itself, to encourage interaction. Selling data on, or putting it to other purposes, – including Cambridge Analytica’s political marketing – was strictly barred.
It also appears likely the project was breaking British data protection laws, which ban sale or use of personal data without consent. That includes cases where consent is given for one purpose but data is used for another.
Yeah
So I, like, need to collect some data, lol
Sorry, can't do that
But I'm like, uh, an academic, this is for great science, see my Cambridge page here, lol
Ah, ok, just don't share it, k?
Yeah, yeah, no prb
kthxby
In a data breach, someone would have used a technical vulnerability or some other (e.g. social engineering) vulnerability of Facebook to get illegitimate access to the data.
In this case Facebook simply gave them access to the data and took their word that they won't misuse it.
Now maybe the latter situation might not be a data breach in the classical sense, but I don't see how it makes it any better for the victims. If anything it seems worse -- Facebook didn't even try to protect their data.
I mean, in traditional phishing the user is tricked into providing a password by someone impersonating a banking site, and gets their funds stolen. In the case in question, users are tricked into providing personal information by the promise of some kind of personality analysis, but their data is used for political propaganda that they didn't ask for, resulting in life-changing consequences due to politics.
(not sure if you meant /s)
> A data breach is a security incident in which sensitive, protected or confidential data is copied, transmitted, viewed, stolen or used by an individual unauthorized to do so.
If you redefine a concept and use it against your political enemies don't be all that surprised when they turn around and use it against your political allies.
> This wasn't a data breach
Yes it was.
> it was a misuse of data by a third party.
So a bank robber who gets into the vault just misused the locks? Or the security guard misused his eyes? This was a data breach. Your language makes it sound less serious than it is, and you are wrong. This was a data breach.
Edit: Less than 30 seconds in, this post is already downvoted. I won't complain about downvotes of course, but it's insane that no conversation is actually allowed to happen on this site without burying one side. I spoke with a neutral tone, didn't do any name calling, I'm not looking for a fight. But downvotes within seconds! You can't silence me HN. I'll keep commenting my opinions and facts no matter how much you don't like what I'm saying. Nitpicking breach or misuse is silly and distracts from the actual substance of the article. It was a breach, by the way.
Seriously?
I'm no fan of any politician (or really of "tyranny of the majority") but I was kind of impressed at how well it all worked out last time. An "unpopular" candidate won and the ruling elite turned over the reins just like they're supposed to do, trusting the checks and balances in the system to work.
The quickest way to get a dictatorship is to go against the legal results of an election because the unpopular candidate won based on some metric of "unpopular" like "kids rioting in the streets".
The case is more like the bank manager allowing the robber into the vault, with full knowledge that the robber wants to and could easily make off with all the valuables, then asking the robber to please not do that before heading back to work, leaving the thief unattended in the vault.
Edit: it's a breach of contract, maybe; but not a "data breach" which I think everyone understands to be more like the case where the vault is forcibly broken into.
If we keep consuming news like this, and do nothing, it's going to escalate massively. Same way as when Snowden told people they were being spied on and they collectively shrugged and continued with their lives as if nothing had happened.
We, people in tech, have a massive moral burden to educate 'normals' on the meaning of news like this!
Remember that Facebook gives you zero access to users’ data just for being an advertiser. This scheme relied on users granting access to an app.
Data access by apps was curtailed two or three years ago to no longer include friends’ data. The permissions dialog has also become far more granular. From my observation, apps seem to mostly respect Facebook’s rules on data scarcity, i.e. asking only for the data they actually need.
GDPR will enshrine this principle in law at least for European citizens, and it’s somewhat likely that it will have some effect far beyond the borders of Europe.
Regarding elections, first steps will likely align the law with that for TV advertisement. Clear information about an ad’s sponsor should be required, as well as the selectors used to target you. I’ve also heard some chatter about requiring a public repository for all ads. Right now, there might be waves of, for example, racist ads that never get reported in the news because the targeting never hits those people that would consider the ad problematic. The Atlantic is running a pilot program with a Chrome extension that records all advertisements you see on Facebook for such a repository.
In the current political climate, it’s unfortunately unlikely that the US will lead with new regulation. But there are a few decent agencies in the US that can squeeze a lot of mileage out of laws already on the books (the special prosecutor, and even the FEC). Social media companies are also quite scared, both because they fear a hit to their business, and because most of their executives do retain some humanity. You can also expect individual European companies to get out the big guns, seeing Trump and other Russia-backed populists rattling the core of the current consensus on liberal, open, civil societies.
As an advertiser you can target users based on very specific details, and track any user that responds to your advertisement campaign.
You get all the access (AKA, you know which part of your customer base resulted from targeted ad campaigns, so anything you can target on, you can attribute to that subset).
Regarding elections and digital platforms, a Dutch policy advisory states that the government should disallow non-transparent political advertisements (all advertisements should clearly state who sponsored them and to promote which political cause, if any), and to ensure that political parties can not hide their trade-offs, they shouldn't be able to micro-target people with a different message (increase taxes vs. decrease taxes, depending on what makes the user more likely to vote for you).
Almost literally right now, IETF 101 is starting in London, and one of the things presented will be a series of proposals by people who claim they (or organisations they work for, the IETF is only for people, corporations can't participate they can just send people to it) have a legitimate reason to snoop on TLS traffic. TLS 1.3 is designed, following BCP#188, to make such snooping impossible without ongoing assistance from one of the endpoints (if the endpoint is co-operating with the snooping there's mathematically nothing anybody can do) and they would dearly like to return to an era when they could snoop with just a little one time assistance. Now, maybe this would have been stiffly resisted anyway, but BCP#188 means anybody who isn't sure has an existing IETF document telling them exactly why this is a terrible idea.
This will not be the end, and has been like this from the very beginning. If foreign companies can get access to this information, then intelligence agencies certainly can too.
We need to find a new way to communicate before this cancer becomes so widespread that the last bastions are lost.
People still socialize and discuss issues in the real world. Having a Facebook group for a church or neighborhood doesn't preclude anyone from going to church or physically interacting with their neighbors. People also still take collective action in the real world - Antifa, BLM and the Tea Party are three modern examples, but there are countless others which simply don't get media attention.
And, all else aside, social media is still perfectly adequate for enabling communication between most people.
I'm sorry, but your comment seems more rooted in hyperbole than reality.
This is increasingly true. Actions are increasingly recorded. Privacy is increasingly undermined. We have a big problem on our hands.
Have email, chat, forums, physical letters, meetings, etc gone away for some reason?
It sounds like they never had full access to the Facebook profiles beyond the 270k who installed the app, but just harvested the friend lists of those 270k. This doesn't give the app developer full access to the friends' profile data, but I guess once you have the network of friend connections you can use other public data sources to fill in or infer the gaps. And of course some of those 50M will have FB profiles that are fully public open books ready for anyone to harvest.
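A back-of-the-envelope check on those numbers, as a rough sketch only: dividing the roughly 50M profiles by the 270k installers gives the average number of additional profiles each install would have to contribute. The function name is made up for illustration, and the calculation ignores overlap between friend lists, so the real per-user friend count would need to be somewhat higher.

```python
def avg_new_profiles_per_install(total_profiles: int, installers: int) -> float:
    """Average number of non-installer profiles each installer must contribute.

    Assumes no overlap between installers' friend lists (an optimistic
    simplification; real friend lists overlap).
    """
    if installers <= 0:
        raise ValueError("installers must be positive")
    return (total_profiles - installers) / installers


# Figures quoted in this thread: ~50M profiles, ~270k app installs.
ratio = avg_new_profiles_per_install(50_000_000, 270_000)
print(round(ratio))  # prints 184
```

So the claim is plausible on its face: it only needs each installer to expose a couple of hundred friends on average.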
I will say as someone who has developed Facebook apps, the whole ecosystem is pretty much on the honor system for protecting user data. There are some seemingly random and capricious (and often erroneous) abuse detection algorithms, but once an app has access to user data who knows what they do with it and whether it was kept secure -- surely Facebook has no idea unless they perform invasive manual physical audits.
There has never been substantial control on profile data harvest on fb. It was whatever you could get users to okay, which was a lot given the value your app had to appear to provide.
That's completely speculative, and we don't need more speculative information ... I'd much prefer to wait for evidence.
One of the worst things Facebook did was to just destroy any expectation of privacy.
Require the user to "connect with Facebook" to see their result. Give them the result, but quietly siphon off every bit of data you can with the access token.
I still run several games on Facebook platform. It’s much easier to acquire and retain users than on mobile and it’s much more profitable because there seems to be a higher propensity for users to pay.
The real scandal is that such data is so easily harvested and freely available.
I'd be interested in seeing how much of facebook's data repository was used in targeted political ads by all parties. Including Russian agitators who have been shown playing both sides.
So, no: “They are all the same” isn’t just cynical and useless. It’s also wrong.
>They are all the same” isn’t just cynical and useless. It’s also wrong.
Please do not put words in my mouth. All I am asking for is journalistic integrity. Media in the U.S. has proven repeatedly to be partisan, which, to a rational person, makes it very difficult to separate fact from propaganda. This article is a case in point.
Unethical politicking is not an excuse for spread of misinformation.
https://rationalwiki.org/wiki/Gish_Gallop
https://en.wikipedia.org/wiki/Argumentum_ad_populum
I'd argue that this article pretty well encapsulates all of the various "scandals" the Trump administration is being bombarded with: breathlessly exaggerated so that people whose mind is already made up can scan over it and add another tickmark to their list of "scandals"
This is part of a consistent pattern. Our media has become as hopelessly partisan as our unfortunate two party system, and unethical behavior on one front does not justify the same on another in response.
Users don't comprehend what permissions they are giving to apps they run. A quiz site getting full access is not surprising.
Once an app has any amount of access the only thing stopping them from harvesting their own clone of your data is an agreement in the ToS that you won't store PII for more than x hours.
These rules are like the bare minimum to stop good actors. If you're a bad actor fb does not do a single thing to protect users from you. As evident in this report fb is also not above blaming the users for the hostile environment fb created and placed them in.
There must be countless copies of harvested fb data out there. My employer at the time once realized we were accidentally storing some PII permanently in a derived field. If good actors can't even keep above the law what do you think the ecosystem looks like in the shadows?
IMO we aren't having the right conversation with fb over how they mistreat our PII and we should loosen the definition of that term when companies like the one in the article can infer our political preferences from the innocuous bits of our lives we tag on facebook.
We should be asking why even an authorized API that can't stop you from copying the data doesn't count as a systematized data breach.
Is your argument that no company should offer any developer APIs at all? It's impossible to stop apps from storing data that they have access to, given malicious intent.
This is like saying that the existence of the Google Calendar API is a "systematized data breach" because an app could copy data from it once authorized by a user.
For ~10 years, FB has provided widgets that show who else likes xy. I know these Social Widgets are not very customizable, and thus not pretty enough to match a custom design, but at least they provide some safety nets.
Maybe Facebook could just provide more Social Widgets/CSS customizability instead of letting people write their own "Facebook Extensions".
This puts Google on the wrong side of the line, wherever it is, next to other big offenders - fb, twitter, linkedin.
To waffle less, I would absolutely be very cautious with who you give access to your gcal. You can tell a lot about a person knowing their schedule, who they meet with, where they meet, when they fly, etc. Lots on a calendar
But yes, you are right that I’m sure lots of apps kept that data and sold it.
You people should pick your battles. It would help if you knew the battlefield first.
I am so glad you know more than the UK, EU, US etc governments who have identified Russia as the primary source of instability for elections.
And since when has this been an either/or scenario. You can focus on both Russia and China.
You really think governments wouldn't have checked this?
As far as I can tell, there is no data breach, right? It sounds like CA got facebook data through an app they wrote, thisisyourdigitallife, which did some shady things.
Also, "The New York Times is reporting that copies of the data harvested for Cambridge Analytica could still be found online".
The link is: https://www.nytimes.com/2018/03/17/us/politics/cambridge-ana...
Anyone know what they're talking about? I haven't heard of any 50-million-profile data dump, and I really like collecting corpora...
Basically FB gave the data away. Apps have access to the data but they're not allowed to give/sell it to third parties. In this case the rules were ignored. Probably many other companies with API access have also ignored the rules. In this case FB didn't make much of an effort at all to prevent it from happening so it's reasonable to assume the practice is rampant. There are likely many copies of large parts of FB data out there (left on laptops on trains or on unprotected FTP/HTTP servers, etc.).
It's a 'breach' from the users' perspective.
This is exactly how Facebook was designed. You get a stupid quiz or photo frame in exchange for a copy of your friends list. It's always worked that way, and it's why Facebook OAuth has been more popular than Google+ and other OAuth providers for 5+ years: app devs can make more money from Facebook OAuth since it comes with a copy of your friends list, so they prefer to integrate Facebook.
The /friends endpoint only returns friends of the user who have also already installed your application.
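To make that rule concrete, here is a toy model of the filtering described above. It is not Facebook's API or code; the function name, the in-memory friend graph, and the choice to return nothing for non-installers are all assumptions for illustration.

```python
from typing import Dict, Set


def visible_friends(user: str, friends: Dict[str, Set[str]], installed: Set[str]) -> Set[str]:
    """Return the friends of `user` who have also installed the app.

    Toy model only: applies the rule "friends who also installed the app"
    to an in-memory friend graph.
    """
    if user not in installed:
        # Assumption: the app sees nothing for users who never installed it.
        return set()
    return friends.get(user, set()) & installed


# Hypothetical data: alice has three friends, but only bob installed the app.
friends = {"alice": {"bob", "carol", "dave"}, "bob": {"alice"}}
installed = {"alice", "bob"}
print(sorted(visible_friends("alice", friends, installed)))  # prints ['bob']
```

Under this rule, carol and dave never show up in the app's view of alice's friends, which is the difference from the older behaviour discussed elsewhere in the thread.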
Also, do you genuinely not find it disconcerting that Facebook leadership go to great lengths to avoid discussing the privacy implications of their service? And the only person in that group who puts herself "out there", so to speak, is instead writing "success literature"?
At the time, more than 50 million profiles represented around a third of active North American Facebook users, and nearly a quarter of potential US voters.
It's actually far easier to create ads targeted at segments with likely political beliefs, and Marketers have access to aggregate numbers of niche segments today.
There's no need to scrape people's profiles or get down to the individual level.
My original comment was more in response to user vs segment level targeting.
Maybe that's legally actionable.
What I didn't understand is why Facebook would grant this - maybe at some point they needed viral apps on the platform and giving user data away encouraged people to make them - but why did it still work a few years ago? But this article made it click: all you can really do to monetise or use millions of profiles of Facebook users is target them with ads, and Facebook is the only place you can target those ads effectively given Facebook user data, and the more data you have the more effective those ads are, the more you pay Facebook.
Facebook don't sell user data, they've long said that - and it's true. They sell the ability to target advertising to their users, and you can do that a whole lot better if you have their user data. So they don't sell it, they give an API for their users to freely give it away, knowing that once you've done all your analysis on it you'll conclude that you should spend money paying Facebook to actually deliver your messages to those users.
This comment is from the “Duped” article that has a different headline and more detail.
Interesting side note .. in Australia we assign school funding based on the highest education received or wage class of the parent (classes A, B ... E or such).