
Defining “Big Data”

Given the confusing and varied interpretations of Big Data, a couple of academics from the University of St. Andrews conducted a meta-analysis of extant definitions in a recent paper, Undefined By Data: A Survey of Big Data Definitions.

1) Gartner:  A threefold definition encompassing the “three Vs”: Volume, Velocity, and Variety.

2) Oracle:  The derivation of value from traditional relational databases augmented with new, unstructured data sources.

3) Intel:  Links big data to organizations generating a median of 300 TB of data weekly.

4) Microsoft:  The process of applying serious computing power (the latest in machine learning and AI) to seriously massive and complex sets of information.

5) Method for an Integrated Knowledge Environment project:  A high degree of permutations and interactions within a data set is what defines big data.

6) National Institute of Standards and Technology:  Data which exceed(s) the capacity or capability of current or conventional methods and systems.

The authors then attempt to coalesce these definitions and venture a new one:  Big data is a term describing the storage and analysis of large and/or complex data sets using a series of techniques including, but not limited to, NoSQL, MapReduce, and machine learning.
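
To make one of the techniques named in that definition concrete, here is a minimal, illustrative sketch of the MapReduce pattern in plain Python. It is only a toy word count under assumed names and sample data, not anyone's production system; real deployments rely on frameworks such as Hadoop or Spark to run the map and reduce phases in parallel across a cluster.

    # Toy MapReduce-style word count (illustrative sketch, not a framework)
    from collections import defaultdict
    from itertools import chain

    def map_phase(document):
        # Emit a (word, 1) pair for every word in one document.
        return [(word.lower(), 1) for word in document.split()]

    def shuffle(pairs):
        # Group the intermediate pairs by key (the word).
        grouped = defaultdict(list)
        for key, value in pairs:
            grouped[key].append(value)
        return grouped

    def reduce_phase(grouped):
        # Sum the counts for each word.
        return {word: sum(counts) for word, counts in grouped.items()}

    documents = ["big data is large", "big data is complex"]
    pairs = chain.from_iterable(map_phase(d) for d in documents)
    print(reduce_phase(shuffle(pairs)))
    # {'big': 2, 'data': 2, 'is': 2, 'large': 1, 'complex': 1}

The appeal of the pattern, and the reason it shows up in definitions of big data, is that each phase works on independent pieces of the data, so the same logic scales from two sentences to billions of records by adding machines rather than rewriting the program.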

                                                       *******************************

Instead of trying to define Big Data (a pointless exercise), it is more fruitful to focus on what it can do: the value it delivers to businesses, consumers, and governments.