Big data (part 2)
Posted: February 4, 2015
A recent column in Forbes kicked off a discussion with a friend of mine, Tim Powell. We’ve both been looking at “big data” and how it not only fails to contribute to, but may even inhibit, competitive intelligence and other analyses.
Let me start with the column:
It points out that the availability of increased data – sports’ own big data – has made sports writers more timid in their predictions. The reason given is that the writers are concerned about having their own opinions thrown back at them when the data is later analyzed and their predictions turn out to be wrong.
The article pointed out an analogous situation: doctors, fearing future lawsuits, might elect for a diagnosis and treatment that is more data-driven rather than one based on their experience and highly trained instincts.
Rich Karlgaard, the column’s author, also pointed to the case of Starbucks in the early 1990s, when it had hit a “slow patch”. Management there was looking to collect more data to figure out what was wrong, but the number two at Starbucks instead went into the field and talked (gasp!) to employees; he found the problem was one of attitude among both new and longtime employees. As the column concludes, “Trust your eyes and ears. The data are your tools not your master.”
That is all true, but the problems of dealing with big data can often be traced to a fundamental flaw that can be expressed this way:
We deal only with what we can measure or have already measured (so Starbucks’ executives never would have talked to employees, but simply would have collected more irrelevant data). That in turn means we measure only what we have or what we can get, instead of first determining what it is that we need – turning the analytical process on its head.
This, I think, is one of the major problems that we in competitive intelligence – and others – face in dealing with the world of big data. Not all data is quantitative or digital – some is qualitative or non-digital. But big data proponents, think of the NSA, operate in a world where data is only what can be collected, stored on a computer, and analyzed from there. So, in the case of the NSA, it focuses on collecting, storing, and analyzing communications data. Why? Because it can.
The problem becomes that people collect data because they can collect it, or because maybe, someday, perhaps, they might need it (ignoring the whole issue of half-life, which is a nice way of saying that some data goes bad pretty quickly, as well as the problem of having too much noise in the data).
The right approach: determine what the question is before you determine what data might be useful to help craft an answer. Today, big data too often seems driven the other way – determine what answer can be provided, and then attempt to drive the end users to produce a question that can be answered.
This practice is indirectly challenged by the conclusions in Erik Dahl, Intelligence and Surprise Attack: Failure and Success from Pearl Harbor to 9/11 and Beyond (Georgetown Univ. Press, 2013). His tabulation of 227 terrorist plots and cases finds that human intelligence is by far the most common reason a plot or effort failed, accounting for approximately 50% of the cases. Interestingly, signals intelligence falls behind both overseas intelligence and unrelated law-enforcement efforts as a reason for failure, accounting for approximately 10% of the cases.