Big Data: It’s About the Quality, Not The Quantity

From small chains to massive corporations, in both the public and private sectors, there seems to be a tacit race underway around data discovery and big data. The old adage, “You can’t take it with you when you go” doesn’t apply here—companies that merge, go bankrupt, or otherwise vaporize have still gathered big data that will become useful for the next guy on deck. But having a warehouse of big data just to be able to say, “Come check out how much data I have” isn’t the objective.

If you’re in the practice of collecting big data, here’s some food for thought: even the highest-level data analysts and scientists have not come to one succinct conclusion on the definition of the phrase “big data.” So maybe the thing to do is to stop postulating and gathering at lightning speed. Maybe we should be organizing and communicating in the best possible ways to determine a few things. Does the data we’re collecting tell us anything substantial about:

  • Sales
  • Inventory
  • Consumer behavior
  • Geo-specific information about POS volume
  • ROI on marketing efforts
  • Employee scheduling, benefits, behaviors

When we are collecting data about these important subsets, what are we doing to ensure that there is no overlap in information that could lead to inaccurate conclusions? Let’s take the employee schedule, for example. Let’s say that across a 10-store franchise, there are 100 employees. On the whole, each employee works at one location, but one employee, we’ll call him Joe, works at three different locations. Without taking this into account, the data may conclude that Joe is working too few hours to continue to qualify for his healthcare benefits through the company. In another erroneous scenario, the data could conclude that there are three Joes, causing payroll to spit out three smaller checks for Joe. That would ultimately cost him more in taxes, generally make his life harder, and create bookkeeping problems if the taxman ever comes to call for an audit.


For the 10-store chain and for Joe, the outcome is not good, no matter how much data—big or small—is collected in an ineffective manner about Joe’s hours, his benefits, his sick days, paid vacation days, and so on.
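The hours problem above can be sketched in a few lines. This is a minimal, hypothetical illustration—the employee ID, store names, hours, and eligibility threshold are all invented—showing how the same shift records give opposite answers depending on whether they are keyed by (name, store) or by a unique employee ID:

```python
from collections import defaultdict

# Invented sample data: (employee_id, name, store, weekly_hours)
shifts = [
    ("E-042", "Joe", "Store 1", 12),
    ("E-042", "Joe", "Store 4", 10),
    ("E-042", "Joe", "Store 7", 11),
]

FULL_TIME_THRESHOLD = 30  # assumed benefits-eligibility cutoff, for illustration

# Naive view: each (name, store) pair looks like a separate part-time employee
per_store = defaultdict(int)
for emp_id, name, store, hours in shifts:
    per_store[(name, store)] += hours

# Correct view: total hours per unique employee ID, across all locations
per_employee = defaultdict(int)
for emp_id, name, store, hours in shifts:
    per_employee[emp_id] += hours

# The naive grouping sees three "Joes," none of them benefits-eligible;
# the ID-based grouping sees one Joe who clearly qualifies.
assert all(h < FULL_TIME_THRESHOLD for h in per_store.values())
assert per_employee["E-042"] >= FULL_TIME_THRESHOLD
```

The design point is not the code itself but the key: until records share a stable identifier, no amount of additional data volume fixes the miscount.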


In another scenario, let’s imagine there are two Joes who happen to have the same last name. The first Joe Smith is a few months from retiring, and the second Joe is fresh out of college in his first management position. By collecting data about the Joe Smiths, the analysis is compromised when it doesn’t catch the difference between Joe one and Joe two. This could lead to Joe one not receiving his pension on time, or Joe two accidentally being paid Joe one’s salary—far more than he’s earning as a green manager just starting out.

These are just two examples of big data going wrong on a relatively small scale, but when you take the scenario of the Joes and extrapolate the inefficiencies it represents, it becomes easier to see how things on a much larger scale could cause catastrophic error. The best way to avoid complications like the ones listed above is to have in place systems of coding that can tell the difference between Joe one and Joe two, or, on a larger scale, prevent a mix-up between a corporate car parts warehouse in Portland, Maine and one in Portland, Oregon.
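The two-Joes failure has a simple mechanical cause worth making concrete: records keyed by a non-unique attribute silently overwrite one another. In this hypothetical sketch (the IDs, roles, and salaries are invented), a name-keyed lookup loses one of the Joe Smiths, while an ID-keyed lookup keeps both:

```python
# Invented employee records: two distinct people share a name
employees = [
    {"id": "E-001", "name": "Joe Smith", "role": "Senior VP", "salary": 140_000},
    {"id": "E-317", "name": "Joe Smith", "role": "Junior Manager", "salary": 52_000},
]

# Name-keyed: the second Joe Smith silently overwrites the first,
# so payroll now holds exactly one (wrong) record for "Joe Smith"
by_name = {e["name"]: e for e in employees}
assert len(by_name) == 1

# ID-keyed: both records survive, and each Joe keeps his own salary
by_id = {e["id"]: e for e in employees}
assert len(by_id) == 2
assert by_id["E-001"]["salary"] != by_id["E-317"]["salary"]
```

The same principle scales up: a warehouse record keyed only by “Portland” collapses Maine and Oregon into one location, while a unique facility code keeps them apart.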

There is no precise solution for how big data is gathered and managed across all organizations. The main component of the solution will ultimately be driven by how seriously company executives and higher-ups take accuracy in big data, and who they put in place to manage it. Pretending it’s not necessary to hire big data analysts is likely a mistake—and not having a system in place for catching mistakes made by any data analysis team is an even bigger error. The answer? Care about collecting relevant data, care about who analyzes it, and care about communicating between analysts and executives. Otherwise, all you have is a chaotic library of information that might cause more harm than good.

Image source: How to Measure and Monitor the Quality of Master Data | FiNETIK – Asia and Latin America – Market News Network


Comments: 2

  • As Robertson states, there is absolutely no point in gathering vast quantities of data if you are not harvesting it for business strategic insight. Organisations should be focusing on utilising the readily available data they already possess more thoroughly, imaginatively and effectively before investing in the technologically and culturally complex domain of ‘big data’.

    Yet the truth is that very few organisations are data-driven; most are opinion-operated and therefore lack both the capability and – critically – the culture to generate such insight from Big Data. Many organisations struggle to maintain the quality of the ‘small data’ that their existing IT systems make available to them every day. It would therefore make sense to learn, implement and benefit from well-defined, but rarely practised data management strategies. The answer therefore doesn’t lie in collecting a lot of data; it’s about having the right data and getting the most value from that data. And, to make sense of and gain value from data, businesses must hold the belief that rich, well-organised data enables strategic vision into business operations.

    However, rather than employing teams of data analysts, the swiftest and most accessible route into data analysis and insight is the emerging discipline of visual analytics, which utilises our visual perception system and its innate pattern discovery. In this scenario no-one needs a degree in statistics or mathematics to be an effective visual analyst. The best of the current breed of visual analytics software encourages exploration of data, of all forms and sizes, and enables simple, effective communication of insights discovered to non-technical audiences.

    It’s analytics – and visual analytics in particular – rather than Big Data, which powers the transition from “opinion-operated” to actively “data-driven” and delivers insight, understanding and action.

    Guy Cuthbert,
    Managing Director,
    Atheon Analytics

  • Alan Lucaz says:

    I agree with Nathan (and Guy’s subsequent comment).

    Big Data is about being open to gathering and analyzing vast amounts of heretofore unavailable data. Some businesses will absolutely see no value in implementing a Big Data initiative, but many will to varying degrees.

    I just saw a video on YouTube that your readers might find relevant. The sci-fi-inspired video is based on IT industry research into Big Data initiatives gathered by IT services provider TEKsystems. Not only does it break down the preparations and planning necessary to start a Big Data project, it does so through multiple classic sci-fi references.
