Why is Big Data Such a Big Deal Now?

If you work in the tech industry, you have surely heard the term big data. But why are we talking about it so much now, when data (and large amounts of it) has been around for a while (that is, for far longer than five years)? We have long had big data systems for weather prediction, medical record storage, Google searches, inventory management, and so on.

Big data has indeed been around, but it is no longer restricted to the likes of IBM and Google. It has become cost-effective for ‘anyone’ to do.

There are three things you need for big data.

1) Large amounts of Data

We generate a lot of data every day. Because of smartphone adoption, the use of social media, and the proliferation of the internet, we emit large volumes of storable data. How much? By one widely quoted estimate, we now create as much information in two days as we did from the dawn of man through 2003. 2 DAYS!

As an interesting flashback, take a look at this paper that talks about how much data we would have by the year 2000.


2) Ability to store that Data

OK, so now you have the data source, but where do you store it? Well, it just so happens that the cost of storage has gone down significantly over time.

In 1980, a 1 terabyte (1,000 gigabytes) hard drive, which is standard on today's computers, would have cost you approximately $193 million. Now you can get one for about $100 (retail).

$500 buys a large volume of space on Amazon’s cloud. It’s the equivalent of what would have cost hundreds of thousands of dollars just 10 years ago.

Here is a history of hard disk cost over time: http://www.mkomo.com/cost-per-gigabyte
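
To put the 1980-versus-today comparison in concrete terms, here is a small back-of-the-envelope calculation in Python. It uses only the approximate figures quoted above ($193 million and $100 per terabyte, with 1,000 gigabytes per terabyte). The function and constant names are made up for illustration, and the results are only as accurate as those rough figures.

```python
# Approximate figures quoted in this post; treat them as rough estimates.
COST_1980_PER_TB = 193_000_000  # dollars for 1 TB in 1980
COST_NOW_PER_TB = 100           # dollars for 1 TB today (retail)
GB_PER_TB = 1_000               # the post counts 1 TB as 1,000 GB


def cost_per_gb(cost_per_tb):
    """Convert a price per terabyte into a price per gigabyte."""
    return cost_per_tb / GB_PER_TB


def reduction_factor(old_cost, new_cost):
    """How many times cheaper the new price is than the old one."""
    return old_cost / new_cost


if __name__ == "__main__":
    print("1980 cost per GB:", cost_per_gb(COST_1980_PER_TB))
    print("Current cost per GB:", cost_per_gb(COST_NOW_PER_TB))
    print("Times cheaper:", reduction_factor(COST_1980_PER_TB, COST_NOW_PER_TB))
```

With these numbers, a gigabyte goes from roughly $193,000 to about 10 cents, which is a drop of nearly two million times.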


3) Ability to process that Data

So now you have the data, and you have it stored on your infinitely large, affordable hard drive in the cloud. Now what? Now comes the ability to process it. Setting aside the fact that you need to actually know the math and the queries to run (which is why data scientist is the new sexy job), you need the tools to actually process it. Microsoft Excel just won't cut it.

We can thank open source projects like Hadoop and Druid, which allow us to sift through these large amounts of data and find the answers to our questions.
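
To give a feel for what that sifting looks like, here is a toy word count written in the map/reduce style that Hadoop is best known for. This is a single-machine sketch in plain Python, not Hadoop's actual API. The function names are made up for illustration, and a real cluster would spread each phase across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence over all records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big deal"]))
```

The point of splitting the work this way is that the map and reduce steps each handle one small piece at a time, so they can run in parallel on as many machines as the data needs.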

Oh, and that little phone in your pocket is over 15 times faster than the CPU of the 1979 Cray-1 supercomputer (the kind used to try to predict weather, a very hard and computationally intensive process). Put another way, if you transported that mobile phone back to 1987, it would be on par with the processors in one of the fastest computers in the world at the time, the ETA 10-E, which had to be cooled by liquid nitrogen.