The Data Director on the Sanders campaign discovered the error and (he claims) was verifying and documenting the bug, which was then reported to the Democratic National Committee (DNC) and NGP VAN. The DNC claims these actions were not in good faith, and as a reaction cut the Sanders campaign off from the system.
This is a BIG deal for a campaign, so close to the first elections. Campaigns rely on that data to inform nearly everything they do, and rely on access to such tools to conduct their voter outreach program. Being cut off from the system is crippling for a campaign, which is likely why the Sanders campaign so quickly sued to get its access reinstated [1].
[1] - http://www.politico.com/story/2015/12/sanders-campaign-threa...
edit: typos
"The database logs created by NGP VAN show that four accounts associated with the Sanders team took advantage of the Wednesday morning breach. Staffers conducted searches that would be especially advantageous to the campaign, including lists of its likeliest supporters in 10 early voting states, including Iowa and New Hampshire. Campaigns rent access to a master file of DNC voter information from the party, and update the files with their own data culled from field work and other investments. After one Sanders account gained access to the Clinton data, the audits show, that user began sharing permissions with other Sanders users. The staffers who secured access to the Clinton data included Uretsky and his deputy, Russell Drapkin. The two other usernames that viewed Clinton information were “talani" and "csmith_bernie," created by Uretsky's account after the breach began. The logs show that the Vermont senator’s team created at least 24 lists during the 40-minute breach, which started at 10:40 a.m., and saved those lists to their personal folders. The Sanders searches included New Hampshire lists related to likely voters, "HFA Turnout 60-100" and "HFA Support 50-100," that were conducted and saved by Uretsky. Drapkin's account searched for and saved lists including less likely Clinton voters, "HFA Support <30" in Iowa, and "HFA Turnout 30-70"' in New Hampshire. Despite audit logs, Weaver said at the news conference that NGP VAN has told the campaign that no Clinton data was printed or downloaded."
http://www.bloomberg.com/politics/articles/2015-12-18/sander...
It demonstrates the ability of the Sanders campaign to access the Clinton data without actually having the ability to use it once the breach was sealed, which, like the previous breach, it would inevitably be.
It's like making a copy of the personnel files left in the mailroom and sticking them in your mailbox. Lets you demonstrate they got left out in case VAN tries to say the breach wasn't serious.
The phrasing here strikes me as somewhat vague. Are they implying that Weaver's statements are in conflict with the audit logs, or are they (somewhat ineffectively) implying that "saving lists" merely equates to bookmarking a certain query?
NGP stated:
"So for voters that a user already had access to, that user was able to search by and view (but not export or save or act on) some attributes that came from another campaign."
What exactly do they mean by "view", let alone "act on"? If someone was truly dedicated to extracting data through their browser, are the terms truly mutually exclusive?
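To make the question concrete, here is a minimal sketch of why "view" and "export" are hard to separate once data reaches a browser. It assumes the attributes are rendered as an ordinary HTML table; the page content, column names, and the `TableText` class are all made up for illustration and say nothing about how VAN actually renders its screens.

```python
import csv
import io
from html.parser import HTMLParser


class TableText(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


# Hypothetical page content standing in for a "view only" screen.
page = """
<table>
  <tr><th>voter_id</th><th>support_score</th></tr>
  <tr><td>V1</td><td>72</td></tr>
  <tr><td>V2</td><td>18</td></tr>
</table>
"""

parser = TableText()
parser.feed(page)

# Anything that was rendered can be written back out as structured data.
out = io.StringIO()
csv.writer(out).writerows(parser.rows)
exported = out.getvalue()
```

Under those assumptions, the distinction NGP draws would describe what the UI offers, not what a determined user could do with what is on screen.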
What I'm surprised about is that the campaigns are willing to let this data be stored in the cloud on shared systems. I would have expected all proprietary data to be stored locally by each campaign on private in-house servers, probably with periodic data dumps of updates from the data provider.
Why put forth the expense of obtaining (purchase or rent) hardware and staff to maintain that hardware? Additionally, why put forth the time and expense to write or compose a CRM-like software solution that integrates with voter data, what sounds like a dialer/call center, and "big data" tools (Spark, Hadoop, Tableau, SSIS/SSRS) that probably needs a good 6 months lead time before the candidate even announces a run for office? Also, why would every potential candidate do this every 4 years?
Sounds like a perfect choice for a hosted solution that can be iterated on outside of the election cycle.
Private in-house servers are very expensive to set up and maintain. Nearly everyone stores vital personal information on someone else's servers.
Not the campaigns...the parties.
The problems listed below are pretty exact: huge data sets, lots of cleaning and normalizing, and the snail mail/CD problem is real. Additionally, I'd note that ~40% of the states [somehow] charge for the data...it takes six digits to get a snapshot of all 50 states - and certain states (looking at you FL) say that they do not store the historical data, meaning you have to connect with the local BoEs to aggregate the data.
A part of me [now] wants to open source this because of the DNC's actions.
Even then, you only have a snapshot, because the states typically don't keep historical data. What this means is that your dataset won't be as good as that of someone who's been collecting this data for years, and who thereby knows things you won't, like where someone used to live, how often they voted there, who recently dropped off the registered voter rolls, etc.
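A rough sketch of why multiple snapshots matter: with two of them you can derive who dropped off the rolls and who moved, which a single snapshot can never tell you. The voter IDs, the `address` field, and the `diff_snapshots` helper are hypothetical; real state files differ in format and identifiers.

```python
def diff_snapshots(old, new):
    """Compare two voter-file snapshots (dicts of voter_id -> record)."""
    old_ids, new_ids = set(old), set(new)
    # On the rolls before, missing now.
    dropped = sorted(old_ids - new_ids)
    # Newly registered since the older snapshot.
    added = sorted(new_ids - old_ids)
    # Present in both, but at a different address.
    moved = sorted(
        vid for vid in old_ids & new_ids
        if old[vid]["address"] != new[vid]["address"]
    )
    return {"dropped": dropped, "added": added, "moved": moved}


# Made-up data for illustration only.
snapshot_2014 = {
    "V1": {"address": "1 Elm St"},
    "V2": {"address": "2 Oak Ave"},
    "V3": {"address": "3 Pine Rd"},
}
snapshot_2015 = {
    "V1": {"address": "1 Elm St"},
    "V3": {"address": "9 Birch Ln"},
    "V4": {"address": "4 Maple Ct"},
}

changes = diff_snapshots(snapshot_2014, snapshot_2015)
```

Someone holding only `snapshot_2015` has no way to know V2 ever existed or that V3 moved, which is the advantage of whoever has been archiving these files for years.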
In this case, even this data wouldn't be enough, because the Sanders team had made likely hundreds of thousands of contacts with voters, and recorded what issues they cared about and who they planned to vote for. This data, which they personally collected, is now inaccessible to them.
edit: expounded
Nationbuilder, a sorta-competitor to NGP, has put together a national voter file and it's reasonably priced. https://elections.nationbuilder.com/about/faq
The DNC/NGP voter file, however, is significantly enhanced - for one, it's got a lot of phone numbers, which most states don't include in their lists. There's a lot of other survey and consumer data associated.
For a national voter file that's good enough to use for, say, a City Council race anywhere in the country, where all you want to know is "who is likely to vote in this non-presidential election", it's doable but very expensive. For anything serious, it's pretty much outside the capabilities of anyone but the parties and some of the very big SuperPACs/very big orgs.
The Sanders campaign had reported a different issue with a different vendor's software in the past.
EDIT: Actually, on second reading, the Sanders lockout was not for security reasons; the DNC imposed it only while awaiting full details from the campaign. In that case it wouldn't make sense to suspend any other campaign's access. They are punishing the Sanders campaign in hopes that it prompts a quick confession of exactly what data the campaign accessed and retained. I still don't think that response is as unreasonable as some Sanders supporters are alleging.
"That is just like if you walked into someone's home when the door was unlocked and took things that don't belong to you in order to use them for your own benefit."
Essentially, "gray-hat" hacking isn't always seen as a friendly warning to the vulnerable; it can just as easily look like an attack. To draw a physical parallel between trespassing and a gray-hat style hack: suppose someone entered your house, took your gold watch from your bedroom, then walked down to where you were sitting at your breakfast table, tapped you on the shoulder, and said, "Hey bro, your door was open, and you didn't even secure your jewelry in a safe with a key in case someone did break in. I did this to demonstrate your house's vulnerabilities, you should be grateful! Maybe even give me a little something for my troubles..." One has to wonder how you would react.
Of course, the parallel might not be fair, since one can't really equate a private house with a server that has a public-facing access point to sensitive material, so the closest proxy I can think of is a bank. Still, a similar parallel can be drawn here: you rob a bank without tripping alarms, hand the manager $30,000 of stolen money, and claim you did it to warn him/her of issues with the vault's security. In that case, it's plausible to assume s/he might not be that receptive.
I think it's great that penetration testers and the like are very willing to do the hard work of finding holes in security systems--and not use them for nefarious purposes, but actually disclose them to companies so that they can bolster their systems--but how exactly is the hacked party supposed to take it?
[0]http://www.cnn.com/2015/12/18/politics/bernie-sanders-campai...
NGP-VAN is crap hack software anyways.
Sure it's possible that the Sanders campaign did exploit this and the Clinton campaign did not. But I'm skeptical as hell given the political allegiances of the company's leadership.
Maybe it really is nothing more than "Oh, I've heard of pepsi, so I'd better buy a fucking ton of blue-labeled sugar water every week of my life", but there might be something else evolutionary about power and loyalty and reward.
As was pointed out in this reddit thread [1], the CEO of NGP VAN (Stu Trevelyan) is a strong supporter of Hillary Clinton and worked on the 1992 Clinton-Gore "War Room," and then in the Clinton White House [2].
[1] https://www.reddit.com/r/technology/comments/3xbt3w/bernie_s...
Hardly getting any blame is a neat trick. I wish I had that luxury.
See also: "goto fail;"
[1] - http://www.bloomberg.com/politics/articles/2015-12-18/sander...
[2] - http://heavy.com/news/2015/12/josh-uretsky-bernie-sanders-ca...
edit: added source
The media is going to have a field day with this.