The human mind is not built to see relationships and patterns in tabular data. If a presenter lists the numbers 1, 2, 3, 4, 5, the viewer may see that this is a simple progression. If the numbers are presented a little differently, such as 1.0, 2.07, 2.98, 4.11, 4.9, the relationship is not immediately clear, although the pattern is essentially the same. If both sequences were plotted on a chart, however, the viewer would immediately see two nearly identical lines.
Image Source: China v America: how do the two countries compare?
Real-world data sets generally resemble the latter progression, with many more members. Analysts often compile two or more sets of data, and attempt to identify relationships between the data, in order to make business decisions. Effective data visualizations help analysts distill the potentially huge amounts of source data into salient results using charts and graphs.
Visualizations in Business
Data visualization is used in many industries. The field of genetics has progressed rapidly with the advent of data acquired from sequence mining, which is analyzed for many purposes, including how an individual's DNA sequence could be used to predict his or her risk of developing cancer or another disease. Spatial data mining is used to identify areas at significant risk of air or water pollution. Business analysts use visualizations in all aspects of the decision process, including the following:
- Product Application – companies can analyze data to find how their customers are using their products. This can aid in feature planning for future development, in identifying key customer segments, and in identifying customers by locale. One example is the effort by a leading discount store to compile data from credit card use, store loyalty cards, and warranty registration data in order to determine such outputs as sales trends, the effectiveness of marketing campaigns, and an accurate measure of customer loyalty.
- Customer Relationship Management – companies can analyze customers by distribution channel to find which are more likely to respond to particular types of marketing campaigns or particular types of products. In this way, they can replace the inefficient method of cold calling with campaigns directed at a specific market channel or segment. By analyzing historical data, companies can estimate the return on investment (ROI) of their campaigns or new product releases.
These are just a few of the powerful business applications of visualization of warehouse data.
Image Source: The Different Types of Wine
Techniques for Visualization
The following are techniques used for processing tabular data into visualizations that reveal key relationships.
- Anomaly detection – charting techniques make deviations from the norm readily apparent. Some deviations are errors that can be removed from the data set. Others are indicators of an important business relationship. Outlier detection identifies which points should be analyzed to determine their relevance.
- Dependency modeling – often, two data sets will trend or cycle together because of some dependency. An obvious example would be inclement weather and the sale of umbrellas. Other, less obvious relationships can be uncovered by dependency modeling. Companies can monitor accessible factors (such as the weather) to predict less apparent factors, such as sales of a certain kind of product. When one data set is charted against the other, a positive relationship between them will appear roughly as an upward-sloping line.
- Clustering – as data sets are charted, analysts can uncover a tendency for data points to cluster into groups. This can uncover data relationships in a similar fashion to dependency modeling, but for discrete variables.
- Data Classification – a way to use parametric data to classify entities, similar to clustering. For instance, an insurance company can use lifestyle data for a client to determine whether the client is high-risk or low-risk.
- Data Regression – can take two or more data sets and determine the level of dependency between them and an equation that relates them. This is the mathematical equivalent of dependency modeling. Regression can determine an equation for the line and calculate how closely the data points match the line (how fuzzy the line would appear on the chart).
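To make the regression idea concrete, here is a minimal sketch in Python that fits a straight line to the noisy sequence from the opening paragraph (1.0, 2.07, 2.98, 4.11, 4.9). It assumes the values sit at positions 1 through 5 and uses ordinary least squares, which is one common way to fit a line but not the only one. The function name fit_line and the variable names are illustrative, not taken from any particular BI tool.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 measures how closely the points match the line:
    # 1.0 is a perfect fit, values near 0 mean the line explains little.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot

    return slope, intercept, r_squared


# Assumption: the values are observed at positions 1 through 5.
positions = [1, 2, 3, 4, 5]
values = [1.0, 2.07, 2.98, 4.11, 4.9]

slope, intercept, r_squared = fit_line(positions, values)
print(f"y = {slope:.3f} * x + {intercept:.3f}, R^2 = {r_squared:.4f}")
```

For this data the fitted slope comes out close to 1 and R^2 close to 1, which is the numerical counterpart of seeing the points fall along a sharp, barely fuzzy line on a chart.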
Business Intelligence (BI) software is used to automate these types of techniques. But the most important takeaway is that effective data visualizations can help companies sift through large amounts of data, in order to identify important relationships and aid in business decision making.