We already know that Big Data is a big deal, and it’s here to stay. In fact, 65% of companies fear that they risk becoming irrelevant or uncompetitive if they don’t embrace it. But despite the hype surrounding Big Data, companies struggle to make use of the data they collate. Read on to find out what the problems are with big data implementation.

Problems with Big Data

Pioneers are finding ways to use Big Data insights to stop credit card fraud, anticipate and intervene in hardware failures, reroute traffic to avoid congestion, and guide consumer spending through real-time interactions and applications.

61% of companies state that Big Data is driving revenue because it is able to deliver deep insights into customer behavior. For most businesses, this means gaining a 360° view of their customers by analyzing and integrating existing data.

A recent CapGemini report agrees, stating “Digital customer experience is all about understanding the customer, and that means harnessing all sources – not just analyzing all contacts with the organization, but also linking to external sources such as social media and commercially available data. For the digital supply chain, it is about collecting, analyzing and interpreting the data from the myriad of connected devices.”

The biggest problem facing organizations is how to get value from this data. Only 27% of the executives surveyed described their big data initiatives as successful. This indicates that there is a huge gap between understanding big data in theory and putting that theory into practice.

So what’s the problem?

Top 5 Big Data Problems

1. Finding the Signal in the Noise

It’s difficult to get insights out of a huge lump of data. Maksim Tsvetovat, data scientist and author of the book “Social Network Analysis for Startups”, said that to use Big Data, “There has to be a discernible signal in the noise that you can detect, and sometimes there just isn’t one. Once we’ve done our intelligence on the data, sometimes we have to come back and say we just didn’t measure this right or measured the wrong variables because there’s nothing we can detect here.” He went on to say that in its raw form, Big Data looks like a hairball, and a scientific approach to the data is necessary. “You approach it carefully and behave like a scientist, which means if you fail at your hypothesis, you come up with a few other hypotheses, and maybe one of them turns out to be correct.”

2. Data Silos

Data silos are basically Big Data’s kryptonite. They store all of that wonderful data you’ve captured in separate, disparate units that have nothing to do with one another, so no insights can be gathered from it because it simply isn’t integrated on the back end. Data silos are the reason you have to number crunch to produce a monthly sales report. They’re the reason that C-level decisions are made at a snail’s pace. They’re the reason your sales and marketing teams simply don’t get along. They’re the reason that your customers are looking elsewhere to take their business: they don’t feel their needs are being met, and a smaller, more nimble company is offering something better. The way to eliminate data silos? Integrate your data.

3. Inaccurate Data

Not only are data silos ineffective on an operational level, they are also a fertile breeding ground for the biggest data problem: inaccurate data. According to a recent report from Experian Data Quality, 75% of businesses believe their customer contact information is incorrect. If you’ve got a database full of inaccurate customer data, you might as well have no data at all. The best way to combat inaccurate data? Eliminating data silos by integrating your data.

4. Technology Moves Too Fast

Larger corporations are more prone to data silos, because they prefer to keep their databases on-premises and because their decision making about new technologies is often slow.

One example cited in the CapGemini report is that stalwarts like telcos and utilities “…are noticing high levels of disruption from new competitors moving in from other sectors. This issue was mentioned by over 35% of respondents in each of these industries, compared with an overall average of under 25%.” In essence, traditional players are slower to move on technological advances and are finding themselves faced with serious competition from smaller companies because of this.

Big Data is also fast data. Paul Maritz, Chief Executive Officer of Pivotal, part of the EMC Federation, wrote in a recent CapGemini Report that, “If you can obtain all the relevant data, analyze it quickly, surface actionable insights, and drive them back into operational systems, then you can affect events as they’re still unfolding. The ability to catch people or things “in the act”, and affect the outcome, can be extraordinarily important, valuable and disruptive.”

The ability to make snap decisions and quickly move on Big Data insights is the advantage SMEs have over large corporations.

5. Lack of Skilled Workers

CapGemini’s report found that 37% of companies have trouble finding skilled data analysts to make use of their data. The best bet is to form one common data analyst team for the company, either by re-skilling your current workers or by recruiting new workers who specialize in big data.

You need to find employees who not only understand data from a scientific perspective, but also understand the business and its customers, and how their data findings apply directly to them.

Data Integration is Key

Data integration – or to be technical, data harmonization – is absolutely essential for getting the full advantage out of your Big Data. Data integration addresses the backend need for getting data silos to work together so you can obtain deeper insight from Big Data.
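As a rough illustration of what getting silos to work together can look like, here is a minimal sketch that merges records from two separate systems into a single view per customer. The record layout, the field names, the `integrate` function and the choice of email address as the matching key are all assumptions made for this example, not a description of any particular product.

```python
def normalize_email(email: str) -> str:
    """Lowercase and trim an email so the same address matches across systems."""
    return email.strip().lower()


def integrate(crm_records: list[dict], support_records: list[dict]) -> dict[str, dict]:
    """Merge records from two silos into one entry per customer, keyed by email."""
    unified: dict[str, dict] = {}
    for source, records in (("crm", crm_records), ("support", support_records)):
        for record in records:
            key = normalize_email(record["email"])
            customer = unified.setdefault(key, {"email": key, "sources": []})
            customer["sources"].append(source)
            # Keep the first non-empty value seen for every other field.
            for name, value in record.items():
                if name != "email" and value not in ("", None) and name not in customer:
                    customer[name] = value
    return unified


# Hypothetical sample data: the same customer appears in both systems.
crm = [{"email": "Ana@Example.com", "name": "Ana Silva", "plan": "pro"}]
support = [{"email": "ana@example.com ", "name": "", "open_tickets": 2}]

customers = integrate(crm, support)
print(customers["ana@example.com"])
```

Matching on a normalized email is only one possible choice. Real systems often need fuzzier matching, but the idea is the same: one record per customer, with every source contributing what it knows.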

In the book Big Data Beyond The Hype, the authors Zikopoulos et al. write that “…we see too many people treat this topic as an afterthought—and that leads to security exposure, wasted resources, untrusted data, and more. We actually think that you should scope your Big Data architecture with integration and governance in mind from the very start.” (p.26)

Not only will this save the janitorial work that is inevitable when working with data silos and big data, it also helps to establish the fourth “V” – veracity, that is, the trustworthiness of your data, which will underpin the authority of any insight you gain from analyzing it.

The first step to integrating your data is to ensure you’ve got clean data. Big Data Consultant Ted Clark, from the data consultancy company Adventag, said that “80% of the work Data Scientists do is cleaning up the data before they can even look at it. They’re data custodians rather than analysts. Anything you’ve done more than three times, you should automate – it might take longer the first time but the other times you will save time and focus on an analysis.”

How to Clean and Maintain your Data

1. Remove Duplicates

If you’re using multiple channels to capture data, such as through your website, customer care center, and marketing leads, you’re running the risk of collecting duplicate information. There are tools to help you remove duplicate data. For instance, if you work with Google Contacts you can merge your contacts.

2. Verify New Data

Set company-wide standards for verifying all newly captured data before it enters the central database. Put in checks to confirm that the customer isn’t already in the system, for example under a different name or under their email address.

3. Update Data

Keep your data updated. You can do this by using parsing tools, which scan all incoming emails and update contact information as it comes to hand.

4. Implement Consistent Data Entry

Ensure that all employees are aware of company-wide data entry standards. For instance, each customer record has to have first and last names.
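The four steps above can be sketched in a few lines of code. This is a simplified example under stated assumptions: the field names, the deliberately loose email pattern and the `clean_record`, `validate` and `add_contacts` helpers are hypothetical, and the "database" is just an in-memory dictionary keyed by email address.

```python
import re

# Deliberately loose pattern (an assumption): something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_record(raw: dict) -> dict:
    """Consistent data entry: collapse stray whitespace, lowercase the email."""
    return {
        "first_name": " ".join(raw.get("first_name", "").split()),
        "last_name": " ".join(raw.get("last_name", "").split()),
        "email": raw.get("email", "").strip().lower(),
    }


def validate(record: dict) -> list[str]:
    """Verify new data: return a list of problems, empty if the record is fine."""
    problems = []
    if not record["first_name"] or not record["last_name"]:
        problems.append("first and last name are required")
    if not EMAIL_PATTERN.match(record["email"]):
        problems.append("email looks malformed")
    return problems


def add_contacts(database: dict[str, dict], incoming: list[dict]) -> list[tuple[dict, list[str]]]:
    """Insert or update verified records; return the rejected ones with reasons."""
    rejected = []
    for raw in incoming:
        record = clean_record(raw)
        problems = validate(record)
        if problems:
            rejected.append((raw, problems))
            continue
        # Same email means same customer: update in place, never duplicate.
        database[record["email"]] = {**database.get(record["email"], {}), **record}
    return rejected


database: dict[str, dict] = {}
rejected = add_contacts(database, [
    {"first_name": " Ana ", "last_name": "Silva", "email": "Ana@Example.com"},
    {"first_name": "Ana", "last_name": "Silva", "email": "ana@example.com"},
    {"first_name": "Bob", "last_name": "", "email": "bob@example.com"},
])
print(len(database), "stored;", len(rejected), "rejected")
```

Keying the store on the normalized email is what removes duplicates here: the second record for Ana updates the first instead of creating a new entry, while the record with a missing last name is held back for review rather than entering the database.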

Originally published at blog.piesync.com
