Voice assistants are an increasingly integral part of modern life, and the experiences consumers are having with these digital personalities are growing in scope and significance.

If you are a marketer in 2019 you’ve seen this often-cited voice statistic: By 2020 half of all searches will be voice searches. Based on the consumer adoption trends of the past few years, we’re on track to meet this prediction. How has the voice revolution evolved, how are today’s data-driven brands already finding success, and what can well-established voice channels like phone calls teach us about interacting with consumers via voice?

Voice Is a Continually Evolving Channel

“Brand voice” suddenly has a much more literal meaning thanks to the ability to create voice experiences for consumers. Marketers who want to successfully leverage voice as a channel—and ultimately drive sales—must understand how and why people turn to voice as a part of their daily routines.

The rise in popularity of voice assistants and voice-enabled devices and the ability to create brand voice experiences are the results of a convergence between consumer preference and technological capability.

People like to communicate through speech because it’s easy. And, after all, the promise of technology has always been to make our lives easier: Easier to be connected to one another, easier to perform daily tasks, and easier to do our jobs. Consumer tech has often left users feeling like they must adapt to devices and channels instead of vice versa—until now.

Technology Has Finally Adapted to Fit User Communication Preferences

Many technological advancements have led to the ability to build devices and programs that we can finally interact with in the way that feels most natural: speaking.

There is both a fascination and pragmatism attached to the idea of machines that can understand humans. Thanks in part to this allure, computer speech recognition (the basis of today’s voice assistants) has long been of interest to technologists.

Improvements in computing such as faster processors, better data availability and storage systems, and developments in voice recognition and natural language processing deserve credit for today’s voice technology.

For marketers, this means the voice channel is unique for building immersive consumer experiences. Consumers get to interact digitally with brands in a way that feels, dare it be said—human.

The Evolution of Voice Recognition Technology

1952: Audrey

Audrey, the first speech recognition system, was built in 1952 by Bell Laboratories and programmed to understand only numbers. It could recognize strings of digits with up to 90% accuracy, provided that the speaker was the device’s inventor speaking very, very slowly. The device recognized other speakers with 70% to 80% accuracy. This was an early indicator of the challenges presented by the fact that individuals have very different voices, speech patterns and dialects. Audrey also required substantial space, consumed large amounts of power and was prohibitively expensive to run and maintain. It was suggested that the device be used by telephone operators, but it was an inferior competitor to even simple manual dialing and was never implemented for practical use.

1962: Shoebox

IBM introduced the “Shoebox,” a speech recognition system crossed with a calculator, at the 1962 World’s Fair. It could recognize 16 words including numbers and simple arithmetic terms like “plus,” “minus” and “total.” Visitors to IBM’s World’s Fair Pavilion spoke (still very slowly) to Shoebox via microphone and watched as it printed out answers to the simple math problems they instructed it to perform.

1971: Harpy

The US Department of Defense even got into the speech recognition game by funding Carnegie Mellon University’s development of “Harpy.” Harpy could understand more than 1,000 words (approximately equal to the vocabulary of an average three-year-old).

Harpy was significant because it used a technology called template matching to match the sound wave patterns of words it “heard” to the sound wave patterns of its programmed vocabulary, increasing accuracy levels.
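
To make the idea of template matching concrete, here is a minimal sketch in Python. It is not Harpy’s actual implementation: the vocabulary, the three-number “sound patterns,” and the function name match_template are all invented for illustration. The sketch simply picks the stored word whose pattern is closest to the pattern that was heard.

```python
import math

# Hypothetical templates: each word is represented by a few made-up numbers
# standing in for a stored sound wave pattern. Real systems used far richer data.
TEMPLATES = {
    "yes": [0.9, 0.2, 0.4],
    "no": [0.1, 0.8, 0.5],
    "stop": [0.5, 0.5, 0.9],
}

def match_template(heard):
    """Return the vocabulary word whose stored pattern is closest to `heard`."""
    # math.dist computes the straight-line (Euclidean) distance between patterns.
    return min(TEMPLATES, key=lambda word: math.dist(TEMPLATES[word], heard))

# A slightly "off" pronunciation still lands on the nearest template.
print(match_template([0.85, 0.25, 0.35]))
```

The weakness of this approach is visible even in the toy version: every word needs its own stored pattern, so the vocabulary is limited to whatever has been programmed in ahead of time.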

1980s: The Hidden Markov Model

The hidden Markov model is a statistical method that came into wide use for speech recognition in the 1980s and allowed for a large increase in the number of spoken words computers could recognize. Instead of using the templated approach of Harpy and other prior devices, the hidden Markov model estimated the probability that an unknown sound was a given word. This new approach scaled speech recognition and allowed it to be used commercially.
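
The probabilistic idea can be sketched with a toy example. The Python below is an illustration under invented assumptions, not the code of any real recognizer: the two word models, the sound labels such as “yuh” and “ess,” and every probability are made up. Each word gets its own small hidden Markov model, the standard forward algorithm scores how likely the heard sounds are under each model, and the most likely word wins.

```python
def forward_likelihood(observations, states, start_p, trans_p, emit_p):
    """Return the probability of the observed sounds under one word's HMM."""
    # alpha[s] = probability of the sounds heard so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s].get(observations[0], 0.0) for s in states}
    for obs in observations[1:]:
        alpha = {
            s: sum(alpha[prev] * trans_p[prev].get(s, 0.0) for prev in states)
            * emit_p[s].get(obs, 0.0)
            for s in states
        }
    return sum(alpha.values())

# Hypothetical two-state models for two words; all numbers are invented.
WORD_MODELS = {
    "yes": {
        "states": ("y", "es"),
        "start_p": {"y": 1.0, "es": 0.0},
        "trans_p": {"y": {"y": 0.3, "es": 0.7}, "es": {"es": 1.0}},
        "emit_p": {"y": {"yuh": 0.8, "nn": 0.2}, "es": {"ess": 0.9, "oh": 0.1}},
    },
    "no": {
        "states": ("n", "o"),
        "start_p": {"n": 1.0, "o": 0.0},
        "trans_p": {"n": {"n": 0.3, "o": 0.7}, "o": {"o": 1.0}},
        "emit_p": {"n": {"nn": 0.8, "yuh": 0.2}, "o": {"oh": 0.9, "ess": 0.1}},
    },
}

def recognize(observations):
    """Pick the word whose model gives the heard sounds the highest probability."""
    return max(
        WORD_MODELS,
        key=lambda word: forward_likelihood(observations, **WORD_MODELS[word]),
    )

print(recognize(["yuh", "ess"]))
print(recognize(["nn", "oh"]))
```

Because each word is scored by probability rather than by an exact pattern match, a slightly unusual pronunciation lowers a word’s score instead of ruling it out, which is part of why this approach scaled to larger vocabularies.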

The hidden Markov model was even used to power a slightly creepy children’s toy known as “Julie,” the “doll that understands you.”

1990s: Dueling Dragons

Dragon Systems used the hidden Markov model for their DragonDictate speech recognition application, which powered Microsoft Windows in the early 1990s. DragonDictate still required speakers to hold long pauses in between words to assist the computer in recognizing when one word stops and another begins. In 1997 Dragon Systems released a newer, faster application called Dragon NaturallySpeaking that could recognize continuous speech of around 100 words per minute. However, the system still required 45 minutes of training for best accuracy. Dragon NaturallySpeaking was eventually acquired by Nuance and, many versions later, is still in use today.

Early 2000s: Lack of Adoption

Speech recognition and voice commands were built into early-aughts operating systems including Windows Vista and Mac OS X, but many users were not aware that these features existed. It was easier to use text-driven communication, given that voice recognition development had somewhat stagnated.

Google introduced the first personal voice recognition tool, the Google Voice Search app for iPhone, in 2008. The machine learning-powered app allowed iPhone users to perform voice queries through Google, voice search their contacts and ask questions about geographic location (for example: “Where is the nearest Starbucks?”).

Google Voice Search’s introduction is a great example of the convergence of technological capability and consumer choice. Thanks to cloud data centers, Google could perform the heavy lifting of processing large voice files more easily. Consumers also preferred voice search because it was easier than wrangling with tiny cellphone keyboards.

Shortly after launching the Voice Search app, Google built on its first iteration by focusing more on understanding the nuances of individuals’ speech.

2011: Siri

When Siri was introduced (only eight years ago) voice recognition technology had finally evolved so much that personality was now a factor. Siri’s unique voice and quippy answers simultaneously put users at ease and intrigued them.

Although the voice assistant did receive public criticism for lack of accuracy, it still became a nearly instant companion to many iPhone owners.

Her ubiquity was no fluke. Siri’s success was thanks to the collaboration of many technological heavyweights who contributed equally to different arenas of her development. Organizations and tech giants ranging from DARPA-funded SRI International to Nuance to, of course, Apple were necessary to build a voice assistant capable of capturing the public’s attention and usage.

Today: Personal Voice Assistants

We’re now living in the voice revolution. Voice assistants and voice-enabled devices are prominent and prolific in their ability to understand what we are saying, assist us with daily tasks, enable voice commerce, and facilitate deep interactions with brands.

How Are Today’s Brands Finding Success With Voice?

We’re still on the frontier of voice assistants as a mainstream marketing and branding tool, but some brands are getting an early start within the voice space and as a result are building relationships with consumers through interactive experiences.

Estée Lauder

Beauty leader Estée Lauder helps fans stay accountable for their nightly skin-care routine and provides beauty tips with its frictionless, interactive branding play on Google. Ask Google to talk to Liv at Estée Lauder to get product recommendations, tips, and reminders to wash and moisturize your face delivered to your phone.

Johnnie Walker

Whiskey giant Johnnie Walker is clinking glasses through an Alexa skill that educates consumers about whiskey, allows them to choose a label based on personal preferences, try a guided tasting, and buy a bottle from a nearby store or delivery service.

Johnnie Walker whiskey bottle, hand holding glass of whiskey and Amazon Alexa device

Source: Johnnie Walker

Domino’s

Pizza lovers can now voice order Domino’s through Google Home and Alexa. Taking it a step further, the pizza chain partnered with speech recognition platform Nuance to develop its own voice-powered “order-taking expert,” Dom, which is accessible through the Domino’s app.

What Can Established Channels Like Phone Calls Teach Us About Voice Assistants?

28% of consumers call a business after finding it through a voice search, and callers convert to customers up to 15x more often than web form leads. These are strong indicators of the power of voice as a revenue-driving channel.

Graph of the next steps after making a voice search for local

Source: BrightLocal

By understanding the value of the voice channel as a whole, marketers can yield more successful results from their voice marketing approach. And by looking at a traditional voice channel like consumer phone calls to their business, marketers can gain a deeper understanding of how to acquire more customers from voice search.

Think Locally

Consumers are demanding local information like never before. Across channels, search queries that included a local designation increased 900% over the last two years. Depending on the device, up to 53% of consumers use voice search to look for a local business on a daily basis. Oftentimes, consumers searching for your business are not looking to find products or pricing online, but instead are looking for the phone number of the nearest location. Whether customers are searching for you through a browser or via voice, it’s important to tailor your content locally.

Consumers are most likely to use voice search to find local businesses with less considered services like restaurants, grocery stores, food delivery, and clothing services. However, a significant percentage are using voice search to investigate more considered services like childcare, home services, and senior living facilities.

bar chart of how frequently consumers use voice search to find a local business

Source: BrightLocal

Mine Calls for FAQs to Revamp Your SEO

Marketers can collect insights from phone conversations to learn what callers are saying and what questions they are asking and use the information to power content marketing and SEO. Marketers are doing this now by either manually reviewing call recordings and transcriptions themselves or by automating the process using artificial intelligence.

One company using call analysis to mine caller insights is Central Restaurant Products, the leading wholesale distributor of foodservice equipment. With inbound calls making up 56% of orders and 81% of total revenue, Central has a wealth of conversations to pull from. They analyze calls being driven from specific product pages, see what questions callers are asking, then update content to reflect the consumer questions.

By understanding the common questions customers ask, Central is not only optimizing their content for SEO but also alleviating customer concerns so that the sales team can focus less on education and more on driving conversions.

As a result, Central is able to create a seamless, end-to-end customer experience that significantly boosts conversion rates—increasing calls by 23% and new customers by 13%.

Central Restaurants voice analytics

Analyze Calls From Voice Search to Determine the Next Best Actions

When consumers call your business, it’s important for marketers to be able to access basic information about the caller, see what happens on the call, and make a decision as to how to market to them post-call.

By analyzing calls for intent and outcome, marketers can:

  • Retarget that caller with the most relevant search, social, and display ads
  • Expand their reach by using that caller to improve their lookalike campaigns
  • Exclude that caller from seeing ads that aren’t relevant to them
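
The three actions above can be sketched as a simple rule set. The Python below is a hypothetical illustration, not how any particular call analytics product works: the Call fields, the intent labels, and the audience names are all assumptions made up for this example.

```python
from dataclasses import dataclass

@dataclass
class Call:
    caller_id: str
    intent: str       # hypothetical labels, e.g. "purchase", "support", "job_inquiry"
    converted: bool   # hypothetical outcome flag: did the caller become a customer?

def assign_audience(call):
    """Map one analyzed call to an ad audience (illustrative rules only)."""
    if call.intent == "purchase" and call.converted:
        # Callers who bought make good seeds for lookalike campaigns.
        return "lookalike_seed"
    if call.intent == "purchase":
        # Interested but not yet a customer: retarget with relevant ads.
        return "retarget"
    # Anyone else (support, job seekers, and so on) is excluded from acquisition ads.
    return "exclude"

calls = [
    Call("caller-1", "purchase", True),
    Call("caller-2", "purchase", False),
    Call("caller-3", "support", False),
]
for call in calls:
    print(call.caller_id, assign_audience(call))
```

In practice the rules would depend on each brand’s own definitions of intent and outcome; the point is that once a call has been labeled, the follow-up decision can be automated.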

Comfort Keepers, one of the nation’s leading providers of in-home care for seniors, analyzes what happens on calls to each of their 450+ franchise locations to assess the success of their marketing efforts and determine which targeting campaigns each caller should be included in.

Since phone calls make up 70% of their marketing conversions, they analyze the calls driven to each franchise to determine lead quality. They can then understand not only the volume of calls they are driving, but how many of those calls are potential new customers versus current customers and which next steps should be taken.

how to put callers into audience segments

We’ve Never Been Closer Than With Voice

Voice marketing lets marketers get closer to consumers than ever before, and brands are craving deeper voice experiences. Marketers who are not tapping voice as both a branding and revenue-driving channel are falling behind.

Now is the time to ensure your brand is making the most of voice and take a deeper look at the conversations it’s already having with consumers.

Learn more about voice search and how to use it to drive more calls and customers with our free ebook, The Digital Marketer’s Guide to Voice Search.
