During a crisis like the one we are experiencing, being able to use available data to derive insights that help businesses thrive, or even survive, is vital. Over the past few months, we have been using open source data to enhance our clients’ customer and prospect insight. In this article we explore some examples, then focus on the risk factors associated with COVID-19 and explain how understanding them, and how they apply to your customers, can help you re-engage with consumers and staff.
The last few months have been a challenge for everyone, both personally and professionally. Some businesses have thrived, but many have suffered, and that leaves a lot of questions for the future. But there is data available that can help businesses solve the challenges they are facing. At present, the biggest challenges are COVID-related, so we’ll start with those, but first we’ll take a quick look at the data that is helping to make this possible.
Open source data magic
As far back as 2017, before GDPR came into force, we were working with open source datasets, using the vast quantities of data they provide to enhance what we know about consumers and where they live. Primarily this has been employed to create postcode-level datasets that can be used either in isolation or combined with individual-level data.
There is plenty of amazing open source data available, such as:
- Transport for London’s daily journey data
- Doctors’ surgeries’ prescription data
- Government petition data
We take this data and use a modelling technique called disaggregation to spread the data from a higher geography – e.g. a doctor’s surgery or parliamentary constituency – to a lower level, generally postcode level. That means the data is at a low enough level to be useful, and it can also be appended to any customer record where we have an address.
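To make the idea concrete, here is a minimal sketch of one simple form of disaggregation: a total recorded at a higher geography is shared out across postcodes in proportion to a weight. The weighting by resident population, the function name `disaggregate`, the postcodes and all figures are assumptions made for illustration; the models described in this article may use different weights and a more sophisticated method.

```python
def disaggregate(total, weights):
    """Spread an area-level total across smaller units in proportion to weights."""
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive number")
    return {unit: total * w / weight_sum for unit, w in weights.items()}


# Hypothetical figures: 1,200 prescriptions recorded at one surgery,
# shared across three made-up postcodes weighted by resident population.
surgery_prescriptions = 1200
postcode_population = {"AB1 2CD": 50, "AB1 2CE": 30, "AB1 2CF": 20}

postcode_estimates = disaggregate(surgery_prescriptions, postcode_population)

for postcode, estimate in postcode_estimates.items():
    print(postcode, round(estimate, 1))
```

Because each postcode receives a share of the original total, the postcode estimates still add up to the figure reported at the higher geography.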
Wealth and health
In practice, what this means is that we can rank the UK by wealth and health based on some open source data at postcode level. We can then overlay customer data on this to see where the hotspots are. This can then be used to help target consumers, based on some easily obtainable data and some clever maths.
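As a rough illustration of the overlay step, the sketch below appends a postcode-level rank to customer records and counts customers in highly ranked postcodes. The lookup table, the customer records, the field name `wealth_rank` and the cut-off of 75 are all hypothetical; only the general approach of matching on postcode comes from the description above.

```python
from collections import Counter

# Hypothetical postcode-level wealth ranks (1 = lowest, 100 = highest).
postcode_wealth_rank = {"AB1 2CD": 92, "AB1 2CE": 35, "XY9 8ZW": 78}

# Hypothetical customer records with inconsistently formatted postcodes.
customers = [
    {"id": 1, "postcode": "ab1 2cd"},
    {"id": 2, "postcode": "AB12CE"},
    {"id": 3, "postcode": "XY9 8ZW"},
    {"id": 4, "postcode": "AB1 2CD"},
    {"id": 5, "postcode": "ZZ1 1ZZ"},  # not present in the lookup
]


def normalise_postcode(raw):
    """Upper-case and place a single space before the three-character inward code."""
    compact = "".join(raw.split()).upper()
    return compact[:-3] + " " + compact[-3:]


def append_rank(customer, lookup):
    """Return a copy of the record with the postcode-level rank attached (None if unmatched)."""
    postcode = normalise_postcode(customer["postcode"])
    return {**customer, "postcode": postcode, "wealth_rank": lookup.get(postcode)}


enriched = [append_rank(c, postcode_wealth_rank) for c in customers]

# Count customers per postcode where the rank is at or above an assumed cut-off.
HOTSPOT_CUTOFF = 75
hotspots = Counter(
    c["postcode"]
    for c in enriched
    if c["wealth_rank"] is not None and c["wealth_rank"] >= HOTSPOT_CUTOFF
)
print(hotspots.most_common())
```

Unmatched postcodes are kept with a rank of `None` rather than dropped, so the match rate can be checked before the data is used for targeting.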
As an example, this data can be used to help tailor online content in advertising or on a website at a regional, local or even individual level. Alternatively, it can be used to help prioritise postcodes or sectors for partially addressed mail or door-drops.
Geographic analysis
Another good example is a piece of work we did for a major UK charity. The challenge here was much bigger: we were looking to determine what factors affect giving, and in particular, whether the presence of the services the charity provides affects how much money the charity makes. This was a big project with a lot of research, but the simple answer is yes.
In doing the work, a lot of smaller, more interesting pieces of information also came to light. Some of them, for example, related to charity shops: we found that having up to six shops in an area was good for income, but in areas with more than six, income dropped.
We found that having a shop in a local area increased legacy income. We also found that closeness to a shop increased income but with a long tail – i.e. there are richer areas outside towns where people come into town and donate to shops. This makes sense but without the analysis, and the underlying geographic data, we would not have known that.
COVID-19 models
The next few months will be dictated by how we pull ourselves out of the current coronavirus-impacted situation. Using the same open source data, and marrying it to the wealth of data that the ONS, health services and the government are releasing about COVID-19 and its impact, we have developed some models of how the lockdown affects people, including loss of jobs, inability to travel and so on.
The models are built using open source data disaggregated to different geographies. Fourteen models have been created to date, to help understand the different risk factors associated with COVID-19. The risk factors are then correlated with the infection rate to analyse the association between the two. The data series for each risk measure is ranked by percentile, with 1 being lowest risk and 100 highest risk.
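The ranking and correlation steps can be sketched as follows. This is an illustration under stated assumptions rather than the method used for the fourteen models: the article does not say which correlation measure is used, so Pearson's coefficient is assumed here, and the risk scores and infection rates are invented.

```python
import bisect
import math


def percentile_ranks(values):
    """Map each value to a percentile rank from 1 to 100 (100 = highest risk)."""
    ordered = sorted(values)
    n = len(values)
    # bisect_right counts how many values are less than or equal to v.
    return [math.ceil(100 * bisect.bisect_right(ordered, v) / n) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented raw risk scores and infection rates for five areas.
travel_risk_scores = [0.12, 0.40, 0.33, 0.85, 0.57]
infection_rates = [14.0, 31.0, 22.0, 60.0, 45.0]

risk_percentiles = percentile_ranks(travel_risk_scores)
association = pearson(risk_percentiles, infection_rates)

print(risk_percentiles)
print(round(association, 3))
```

With only five areas the lowest percentile is 20 rather than 1; across the many thousands of small areas in a real dataset, the ranks would span the full 1 to 100 range.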
The risks include Age and Household Risk (larger households with older people are more at risk); Economic Resilience (based on job types and the likelihood of losing a job or income); Engagement Risk (whether people listen to government advice and engage with it); and Travel Risk; as well as infection rate, death rate and R rate modelled to the lower geographies.
There are a number of uses for the datasets that have been created:
- Screening to de-select vulnerable consumers from campaigns
- Attaching codes to inbound contact data to understand the consumer on the phone
- Adding the data to models to ensure these factors are being taken into account when selecting consumers for campaigns
- Screening staff to understand risk factors ahead of any lockdown restrictions being lifted.
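The first use in the list, screening vulnerable consumers out of a campaign, might look something like the sketch below once the percentile scores have been appended to a selection. The field names, the records and the cut-off of 90 are hypothetical; where to draw the line is a decision for each campaign.

```python
# Hypothetical campaign selection with 1-100 risk percentiles already appended.
campaign_selection = [
    {"id": 101, "age_household_risk": 95, "economic_resilience_risk": 40},
    {"id": 102, "age_household_risk": 30, "economic_resilience_risk": 55},
    {"id": 103, "age_household_risk": 62, "economic_resilience_risk": 91},
    {"id": 104, "age_household_risk": 12, "economic_resilience_risk": 18},
]

RISK_THRESHOLD = 90  # assumed cut-off for treating a consumer as vulnerable


def is_vulnerable(record, threshold=RISK_THRESHOLD):
    """True if any field ending in '_risk' is at or above the threshold."""
    return any(
        value >= threshold
        for field, value in record.items()
        if field.endswith("_risk")
    )


mailable = [r for r in campaign_selection if not is_vulnerable(r)]
print([r["id"] for r in mailable])
```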
The data is available at a range of different geographic levels, and which level is appropriate depends on how the data will be used. At the lowest level, the models are created at OA (Output Area) level and can be attached to any individual or household with a postcode. My company provides ward level data, for example, for free, so why not download it and take a look.
We still believe that there is no problem that data can’t help solve. At the moment, there is a wealth of data out there to help solve the challenges we’re currently facing. With much of it freely available, the key question is how to obtain access to it and translate it into something usable and applicable. Find yourself a data partner and you’ll have taken the first step towards putting data to good use and improving customer engagement.