Big Data and the unknown-unknowns

26 Oct 2015

I’ve been following the rise and rise of Big Data and its associated buddies: Data Analytics, Data Mining, Data Science, Data Value, etc. In layman’s terms, it’s basically enormous amounts of digital information computationally analysed to produce value.

Storing and analysing large amounts of data is not a new concept. However, the advent of open source software technologies such as Apache Hadoop and Spark has allowed the storage and analysis of this data to occur on cheap commodity hardware, often at a fraction of what it may have cost in the past.

So what do you store when it now costs you a fraction of what it did yesterday? Well, everything of course! At least that’s what the pundits are telling us.

Of course it’s no good storing everything and then doing nothing with it, otherwise we run the risk of becoming the IT equivalent of one of those poor sods on the UK reality series “Britain’s Biggest Hoarders” (picture a house made out of 5.25-inch floppy disks).

So analysis and creating actionable insights is the name of the game and that’s where the true adventure begins.

My interest in the sport of Big Data, however, stems from the last decade working with Cisco on many different technologies, some of which never made it off the cutting room floor.

Even though the term “Internet of Things” was coined in 1999, it wasn’t until fairly recently that companies were able to effectively commercialise the concept, and begin the journey towards connecting 30-50 billion devices (or things) by the year 2020.

So with this many devices collecting previously incomprehensible amounts of data about every aspect of our lives, where will it be stored and what can be done with it?

The answer to the first part is simple, as storage technology is catching up at an exponential rate. What we do with all this information, however, is not so easy to answer, as we are faced with a knowledge paradigm: there are known-knowns, the things we know we know. We also acknowledge there are known-unknowns, that is to say we know there are some things we do not know. But there are also unknown-unknowns, the ones we don't know we don't know. Still following me? It is this last component that is truly exciting.

IoT is growing in so many different areas that it seems like every day, innovators are finding more and more applications for tracking data on a wide array of devices. 

With the world’s total data now doubling approximately every two years, what does that mean for our knowledge and those elusive “unknown-unknowns”? We simply do not have the foresight to comprehend what we can achieve because we do not yet know what is truly possible.

My advice is to continue to challenge the demons of innovation. Do not accept conventional thinking and always keep an open mind to learning new things.

If education is of interest to you, please be sure to take a look at our many different courses across our Technology, Process and People offerings. If Big Data interests you, we offer Microsoft 20467 - Designing Self-Service Business Intelligence and Big Data Solutions.