The Difference Between Big Data and a Lot of Data

The term “big data” has been around for a while now, but I still come across people who make the same basic mistake when someone asks them to explain what exactly it is.

The problem, as I have pointed out in the past, is due to the name. Big data was never meant to be purely about the size of the data. Right from the start, when the first attempts were made to codify the “rules” of big data, this was the case.

Gartner’s famous “3 V’s” of big data were, in fact, minted to make this very point. In addition to data volume, data velocity and variety were identified as essential to understanding how and why information could be captured, analyzed, and learned from.

So, from the beginning, big data should have more accurately been labelled “big, fast and varied data” – although of course that doesn’t sound so catchy!

So, the problem is this: When clients approach me to work with them, they often say, “We already do big data.” What they mean is that they have big – often huge – datasets. However, they often store that data in traditional structured databases and are used to interrogating it with SQL.

What they have is a lot of data. But that does not mean, by any stretch, that they are “doing big data.”

“Variety” in particular is a very important element of big data. Increasingly, much more data is becoming available to us in the form of messy, “unstructured” data. This includes the millions of photographs and videos uploaded to social media and the wider Internet, or captured on cameras and closed-circuit television in commercial or industrial settings. This data contains tremendous amounts of value to marketers or anyone who wants to understand the behaviour of people in a particular environment. After all, a picture paints 1,000 words – but only if we know how to read them.

It is combining this sort of new, messy, and exciting data with the traditional business analytics we have always carried out that makes “big data,” not simply analyzing terabytes of structured financial data to answer simple questions such as, “What are our best-selling products and services?” While it is useful to know the answer to those kinds of questions, wouldn’t it be better to be asking, “Why are these our best-selling products and services?”
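To make the contrast concrete, the “what” question is the kind that a single aggregate query over structured records can answer. The following is only a minimal sketch using Python's built-in sqlite3 module; the `sales` table, its `product` and `units` columns, and the sample rows are hypothetical and not taken from any real client system.

```python
import sqlite3


def best_sellers(rows, limit=3):
    """Return (product, total_units) pairs ranked by total units sold.

    `rows` is an iterable of (product, units) tuples. The table layout
    is an assumption made purely for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (product TEXT, units INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT product, SUM(units) AS total FROM sales "
        "GROUP BY product ORDER BY total DESC, product LIMIT ?",
        (limit,),
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data: four sales records across three products.
sample = [("widget", 5), ("gadget", 2), ("widget", 4), ("gizmo", 7)]
print(best_sellers(sample))
```

A query like this ranks the products, but nothing in the table explains why they sell. Answering that means bringing in the messier sources described above.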

A lot of data, on its own, is worthless. In fact, it’s worse than that – such data can be positively dangerous, as time and resources have to be spent storing it and keeping it safe from inappropriate eyes. And that’s even before you add in the time and resources that will be wasted if you try to do something with it without understanding what big data is all about.

When big data was emerging as a fashionable buzzword, a lot of people in business really did see it as simply a catch-all term for “a lot of data.” As a result, a lot of businesses spent a lot of time and money measuring, recording, and storing as much data as possible in the hope that, at some point, they’d work out how to glean some actionable insights from it.

These earnest but wrong-headed endeavors were so common that the phrase “data rich but insight poor” became ubiquitous among critics of the “big data revolution.” And it was absolutely a fair comment.

But in the years that have passed, those who truly have grasped the meaning beyond the unfortunate label of big data have shown that it absolutely, unquestionably is possible to generate tremendous value and growth from it, in every industry from banking, finance, and insurance to disaster relief and fighting cancer.

What all of the companies and organizations that have excelled in this field have realized right from the start is that, when it comes to data, it isn’t the size that’s important, it’s what you do with it.

The key point I want to make here is that there is a vast difference between “having a lot of data” and “doing big data.” When you have a large data set that is fast moving, ever changing, and includes unstructured data, and when you are using distributed storage and in-memory analytics, then we are talking big data!

This is why I prefer the term “smart data,” which emphasizes that thinking intelligently about what to do with your data, and how you can use it to achieve your aims, is far and away a more important element of the big data equation than the simple size.

There’s nothing at all wrong with collecting a lot of data. After all, one of the key principles of big data is that the more you record, the more accurately your sample will reflect reality when it comes to the simulations and modelling where the real value is found.
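The link between how much you record and how closely a sample reflects reality can be put in numbers. As a rough sketch, and assuming the records are independent, unbiased draws from the population (an assumption that real data collection often fails to meet), the standard error of a sample mean shrinks with the square root of the sample size. The function name and the standard deviation of 15.0 are illustrative only.

```python
import math


def standard_error(std_dev, n):
    """Standard error of the mean for n independent observations."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return std_dev / math.sqrt(n)


# Illustrative figures: each 100-fold increase in records cuts the
# expected error of the estimate by a factor of 10.
for n in (100, 10_000, 1_000_000):
    print(n, standard_error(15.0, n))
```

The gain holds only for the sampling error. If the data are collected in a biased way, recording more of them does not bring the sample any closer to reality, which is one more reason size alone is not the point.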

But if you are considering setting off on a big data adventure yourself, it’s important to remember that there’s far more to big data than size.

Bernard Marr is a bestselling author, keynote speaker, strategic performance consultant, and analytics, KPI, and big data guru. He helps companies to better manage, measure, report, and analyze performance. His leading-edge work with major companies, organizations, and governments across the globe makes him an acclaimed and award-winning keynote speaker, researcher, consultant, and teacher.

- See more at: http://data-informed.com/the-difference-between-big-data-and-a-lot-of-data/
