Recently I read an article from Network World that claimed the definition of big data changes based on who you’re asking. In the article, data scientist John Rauser said that big data is “any amount of data that’s too big to be handled by one computer.” This is true, but I think there’s a lot more to the concept of big data. Big data is not a new thing. We’ve been dealing with it for years – there are now just more sources and types of data to consider.
As big data has risen to the top of the agenda in 2012, there have been many definitions thrown around in an attempt to truly grasp what big data entails. One common definition centers on the combination of volume, velocity and variety. Volume refers to the enormous amount of data generated and collected by organizations. Velocity means the speed at which the data must be analyzed. And variety refers to the array of different types of data that are collected. Many analysts and reporters have claimed that the combination of these three elements produces big data.
While I agree that these three pieces are important when defining big data, I don’t believe it’s a requirement to have all three categories to get what we call “big data.” For example, if you have volume and velocity, or variety and volume, you are still dealing with big data. It’s the intersection of these areas that creates the big data challenge – and opportunity.
When identifying sources of big data within an organization, I think it’s less about conforming to a particular definition and more about discovering what pools of data you’ve forgotten about. Are there pockets (it doesn’t always have to be massive volumes of data) of unstructured data that your company is not currently analyzing? Have you merged structured and unstructured data to create business insights? Can you pull these smaller data ‘puddles’ into one bigger pool of data to leverage in a new way to solve an old problem?
Think about utilities as an example. As more homes and businesses move to smart grids, the challenge in optimizing consumption is that everyone needs to know their energy consumption in real time in order to make decisions. If every meter in a city is sending 200 characters of data from each location every 15 minutes, that’s a lot of data. Most often, CIOs end up storing this data and haven’t yet figured out how to deal with it.
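To put rough numbers on the meter example, here is a minimal back-of-the-envelope sketch in Python. The 200 characters and the 15-minute interval come from the example above; the one-million-meter city and the one-byte-per-character encoding are assumptions chosen purely for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope estimate of smart-meter data volume.
READING_CHARS = 200      # from the example: characters sent per reading
INTERVAL_MINUTES = 15    # from the example: one reading every 15 minutes
BYTES_PER_CHAR = 1       # assumption: a single-byte encoding such as ASCII


def readings_per_day(interval_minutes: int = INTERVAL_MINUTES) -> int:
    """Number of readings one meter sends in 24 hours."""
    return (24 * 60) // interval_minutes


def daily_bytes(meters: int) -> int:
    """Raw bytes generated per day by the given number of meters."""
    return meters * readings_per_day() * READING_CHARS * BYTES_PER_CHAR


if __name__ == "__main__":
    meters = 1_000_000  # assumption: a hypothetical city with one million meters
    per_day = daily_bytes(meters)
    print(f"{per_day / 1e9:.1f} GB per day")            # 19.2 GB per day
    print(f"{per_day * 365 / 1e12:.1f} TB per year")    # 7.0 TB per year
```

Under these assumptions each meter sends 96 readings, or 19,200 characters, a day. That is small for one meter, but it grows to roughly 19 GB a day and about 7 TB a year of raw readings across the hypothetical city, before any indexing or replication.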
As more tools and technologies become available, the barrier to gaining insights from big data is lowering. Companies need to find a way to expose those forgotten pools of data, filter out the irrelevant information and empower staff with the right analytics to make sense of it. The key is that there shouldn’t be an ivory tower of data scientists any more – business intelligence needs to be distributed throughout the company in order to gain true value.
At Avanade, we are seeing signs that the industry is shifting from a defensive to an offensive attitude in how it approaches and responds to big data. Next week, I’ll share key findings from Avanade’s latest global study, which shows clear value from big data.