In the public world, though, it seems to me like big data has had pretty huge results. The most commonly cited example is probably Nate Silver correctly predicting the 2012 US presidential election results in all 50 states using big data sources and techniques--this degree of predictive statistical analysis was previously unheard of in politics.
http://dilbert.com/strips/comic/2012-07-29/
Is there some kind of gene which predisposes people to throwing around buzzwords? How is it that human behaviour gave rise to "big data" and "the cloud"?
Ever notice how the people who throw around the most buzzwords are those who think of, and refer to, themselves as "big picture" thinkers? They have conveniently tricked themselves into thinking that focusing on and mastering any true skill is a distraction from understanding the broader landscape in which they operate.
It is a very tough sell to tell people that their intuitions are wrong, especially when certain practices and beliefs are long entrenched.
It is hard for me to describe to anyone the resistance to anything you present if you haven't been there. The expectations for accuracy are far beyond what can be realistically accomplished. It is humiliating and frustrating to see people who have no business analyzing your work comb over every last number and if ONE number is wrong, the whole plan tumbles.
In my experience, the communication issue is teaching sales, management, etc., that the point is not to play whack-a-mole and "fix" everything, and that the data is not meant to be used as a hammer (IMO). It is simply there to point the company in a direction, or at least to show where things could be improved and to encourage good directions that already exist.
I believe that the expectations do not align with reality at this point. Everyone is looking for some mythical Fountain of XYZ, and it simply does not exist.
"Lies, Damn Lies, and Statistics" is so ingrained in our consciousness that the expected reaction to Big Data is a knee-jerk mistrust of whatever is presented. It is a serious issue, and those of us who have to analyze data have to be cognizant of the fact that we are pushing back against a century (centuries?) of idiots who have used statistics to lie, justify false information, and push agendas.
The U.S. Federal Government is packed with these types.
It will also take something more than Hadoop and the like to do something real... Don't like to be that guy, but please, stop rewriting yet another query language and try to write efficient engines instead. ;)
Heck, back in high school one of the math competitions I won was sponsored by INFORMS (the Institute For Operations Research and the Management Sciences), and I asked my dad "What's operations research?" and he said that it's where people with math Ph.D.s go to make big bucks. Companies like FedEx, Safeway, and WalMart have relied for decades on their massive amounts of operations data to do things like minimize transportation costs or ensure that they're stocking the items with maximum demand.
The thing is, everybody who gains a competitive advantage from big data has reason to keep that fact secret, so that their competitors don't start doing the same thing. What's changed now is that the media itself is facing competitors that use big data themselves to make themselves more relevant than traditional media, and so suddenly it's a Story Of Consequence. It's not that big data is the next big thing - it's that big data was the last big thing that you are only hearing about now, and suddenly a bunch of folks that had never heard of it are now trying to play catch-up.
The stream of data that we and our environments create(d) has been a reality since the moment we made it digital (vs. analog) and started processing it with computers. As soon as you are able to process the data, you are often just one switch / flag away from also storing it.
When telephone systems and backbones started to become digital on a broader scale in the 1990s, the surveillance and data collection that already existed at that time expanded massively because data became accessible and usable on a large scale. The first huge processing farms were built for the FBI, NSA, etc., and Congress appropriated hundreds of millions for them. For the next version of those server farms, the U.S. Congress already provided tens of billions 4-5 years ago. Follow the money and you can find out how long this has been going on.
With every step our world takes toward becoming more digital, these "virtual images" of us will become more complete and we will become more dependent on them. Ask yourself how often you still use a paper map today vs. 10 years ago. Soon there will be a generation of people who will only navigate with digital maps and GPS, leaving a trace of their every movement - for them it will be the norm and they will not know another way. If you prefer a more positive image, think about people living with heart disease and how many of them will survive because of 24/7 surveillance.
One key element with all that data that is often overlooked is that once we depend on it within many areas of our lives, falsification of "data elements" or blocking access to "data services" has a substantial impact on the person herself. Think about finding a new job so that you can pay your rent - almost every step within that is already digital. And faking email conversations, phone interviews etc. today only requires limited resources for "someone" having mandatory access to the mail servers and internet infrastructure. Putting incriminating material on the computers of your business / political / life "opponents" might already be a drag-and-click activity for some of those. Falsifying / sabotaging financial transactions has been reported by various political activists for years and has long been part of Hollywood folklore / films. In short - soon "some" will be able to completely change our real lives by "working the data" - whether we are economically successful, whom we meet, what we think about others / products / politics, whether we die from diseases or not...
It would be too easy to say that what happens with all the data about us, our activities and interactions is a matter of what society we are living in, whether it's a true democracy with civil rights or an oppressive state. This per se is an illusion. It denies how the majority of people live their lives: they want to be part of a community, be safe, have no problems, and enjoy incentives (or things you can buy). And it denies how governments work, because the mere existence of such a feedback / control / surveillance / oppression mechanism changes society itself and the way we are governed.
Think along these lines - what can be done will be done. And if the powers of "some" in our societies continue to be expanded day after day - it certainly will.
It's not evident that such Big Data was useful in the way described. Anyway, the data described could easily be collected - likely at the same cost and accuracy - in an old-fashioned focus group type of way.
Deciding to buy House of Cards required insight - outside of the data and statistics - to ask the right questions and come to a conclusion.
Has using 'Big Data' here led to more accurate or better-value results? Is this even really what 'Big Data' is? Absolutely not clear.
'Data phobia' might be an issue but is not the major issue. The issue is using phrases like 'Big Data' which don't actually mean anything: this phrase alone doesn't create value for business.
Many people treat data as the Oracle of Delphi, hoping to just make an offering (investment) and wait for knowledge to be dropped. This idea that you can just sift through endless data and pull out insights is just plain wrong. First start with the decisions you can or have to make, then look at the data to see if it gives you insight into the best choice.