OK, so you’ve got data spread across a number of cloud applications and you have decided to bring that data together to understand your business better. Great. But you have customer information over here and order information over there, invoices in this system, and inventory in that one. When it comes to lining up the data, how does that work, really? In part one of this article, I will describe how to go about matching data across cloud apps. Be sure to read part two on data schema challenges related to getting the data you need from a particular app.

What you have is a schema problem. Each cloud app defines its own schema, a format for categorizing and storing data, sort of like the rows and columns in a spreadsheet. Your Customer Relationship Management (CRM) system will have details about your customers organized one way, and your billing system will have lots of information about your invoices organized another. In order to bring that together, you need to match the customers in the one with the bills in the other.

In database parlance, this matching process is called a join. It would be relatively easy to join your data if everything had a standard format, but when you use disparate cloud apps, each app has its own way of doing things. You need to decide how to map data from one schema onto another. Where to begin?
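To make the idea of a join concrete, here is a minimal sketch in Python. The records, field names (`crm_id`, `customer_ref`), and the function name are all made up for illustration. Real exports from your apps will look different, and the sketch assumes the billing records already carry a value that matches a CRM identifier.

```python
# Hypothetical records exported from two apps; all field names are assumptions.
crm_customers = [
    {"crm_id": "C-001", "name": "Acme Corp"},
    {"crm_id": "C-002", "name": "Globex"},
]
billing_invoices = [
    {"invoice_no": "INV-10", "customer_ref": "C-001", "amount": 250.0},
    {"invoice_no": "INV-11", "customer_ref": "C-002", "amount": 90.0},
    {"invoice_no": "INV-12", "customer_ref": "C-001", "amount": 40.0},
]


def join_invoices_to_customers(customers, invoices):
    """Inner join: pair each invoice with the customer it refers to."""
    # Index customers by ID so each invoice lookup is a single dict access.
    by_id = {c["crm_id"]: c for c in customers}
    joined = []
    for inv in invoices:
        customer = by_id.get(inv["customer_ref"])
        if customer is not None:  # invoices with no matching customer are skipped
            joined.append(
                {
                    "customer": customer["name"],
                    "invoice_no": inv["invoice_no"],
                    "amount": inv["amount"],
                }
            )
    return joined


rows = join_invoices_to_customers(crm_customers, billing_invoices)
for row in rows:
    print(row)
```

The two apps name the same thing differently (`crm_id` in one, `customer_ref` in the other), which is the schema mismatch in miniature: the join only works once you have decided which fields correspond.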

You can start by identifying the things that matter most to you, often called entities. Usually there are relatively few fundamental entities: customers, products, employees, and so on.

Second, you should determine which of your apps “owns” each entity — for example, Customers belong to your CRM system, Products belong to your Inventory Management System (IMS), and Employees belong to your Human Resources (HR) system. We sometimes call the owner of record the master file for an entity.
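One lightweight way to record these ownership decisions is a simple lookup table. The sketch below is only an illustration: the system labels (`"crm"`, `"ims"`, `"hr"`) and the helper name `owner_of` are assumptions, not part of any particular product.

```python
# Hypothetical mapping from each entity to the app treated as its owner of record.
MASTER_SYSTEM = {
    "customer": "crm",
    "product": "ims",
    "employee": "hr",
}


def owner_of(entity):
    """Return the system that owns an entity, or raise if none was assigned."""
    try:
        return MASTER_SYSTEM[entity]
    except KeyError:
        raise ValueError(f"no owner of record assigned for {entity!r}") from None


print(owner_of("customer"))
```

Failing loudly on an unassigned entity is a deliberate choice here: it surfaces an entity nobody has claimed before data about it starts drifting between systems.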

Finally, you standardize your references to each entity. For example, while your accounting system will have information about customers, you may have determined that your CRM is the customer master file, so you want everything to point back to the CRM record of each customer. Typically, this means storing the ID from your CRM somewhere in your accounting system as a reference. Storing these references may require uploading a file or even manually copying and pasting an ID from one system into another. However, the IDs should not change, so you only have to create references when you add new entities (customers, products, and so on).
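As a rough illustration of what standardized references look like in practice, the sketch below assumes the accounting system has a spare field (called `crm_id` here) holding the CRM identifier, and checks which accounting records point at a known CRM customer. All names and sample data are hypothetical.

```python
# Hypothetical CRM master file: CRM ID -> customer name.
crm_customers = {"C-001": "Acme Corp", "C-002": "Globex"}

# Hypothetical accounting records; "crm_id" is the stored reference to the CRM.
accounting_customers = [
    {"acct_id": 501, "name": "ACME Corporation", "crm_id": "C-001"},
    {"acct_id": 502, "name": "Globex Inc.", "crm_id": None},   # reference never filled in
    {"acct_id": 503, "name": "Initech", "crm_id": "C-999"},    # reference to an unknown ID
]


def check_references(accounting, crm):
    """Split accounting records into those linked to a CRM customer and those not."""
    linked, unlinked = [], []
    for rec in accounting:
        if rec["crm_id"] in crm:
            linked.append(rec["acct_id"])
        else:
            unlinked.append(rec["acct_id"])
    return linked, unlinked


linked, unlinked = check_references(accounting_customers, crm_customers)
print("linked:", linked)
print("needs a CRM reference:", unlinked)
```

Note that the customer names differ between the two systems ("Acme Corp" versus "ACME Corporation"), so matching on names would be fragile. Matching on the stored ID sidesteps that, which is the point of standardizing references.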

A few key references across your cloud apps and a good integration system are all you need to start bringing all of that rich data together so you can gain new insights on your business. And all it takes is one or two key insights to make the effort you put into standardizing references really pay off.