Earlier this year, the White House announced that it had named Dr. DJ Patil as its first-ever Chief Data Scientist. With the claim that the U.S. has seen “an acceleration of the power of data to deliver value,” White House CTO Megan Smith (formerly of Google) reinforced the notion that data-driven decision-making is becoming pervasive, and not just in bleeding-edge startups or the biggest technology companies.

Everywhere you turn, companies are hopping on the bandwagon and bringing in someone to be their dedicated in-house “Data Scientist.” People like Jonathan Goldman at LinkedIn have paved the way for this new career, one that Harvard Business Review even called the “Sexiest Job of the 21st Century.”

So why is this happening now, almost 30 years after the first dot-com domain was registered and the floodgates of digital data opened?

It’s no secret that the amount of data companies generate is growing exponentially. As of 2012, about 2.5 exabytes of data (2.5 billion gigabytes) were being created each day, and that number is doubling approximately every 40 months. The volume is growing, and it could be argued that the speed at which the data arrives is growing even faster.
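To make the doubling figure concrete, here is a minimal Python sketch that projects daily data volume from the 2012 baseline. It assumes the 40-month doubling period holds steady, which is an extrapolation and not something the figures above guarantee. The function and constant names are illustrative only.

```python
# Illustrative sketch: project daily data volume from the 2012 baseline,
# ASSUMING the 40-month doubling period stays constant (an extrapolation).

GB_PER_EB = 1_000_000_000  # 1 exabyte = 1 billion gigabytes (decimal units)


def daily_data_exabytes(months_since_2012: float,
                        baseline_eb: float = 2.5,
                        doubling_months: float = 40.0) -> float:
    """Return the projected exabytes created per day under steady doubling."""
    return baseline_eb * 2 ** (months_since_2012 / doubling_months)


if __name__ == "__main__":
    for months in (0, 40, 80, 120):
        eb = daily_data_exabytes(months)
        print(f"{months:>3} months after 2012: {eb:.1f} EB/day "
              f"({eb * GB_PER_EB:,.0f} GB/day)")
```

Under that assumption, the daily volume doubles at each 40-month step: 2.5, 5, 10, then 20 exabytes per day.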

At the same time, this exponential increase means the tools and skills needed to make sense of that data have become a specialized skillset, and not just another tool in a developer’s or marketing analyst’s toolkit.

Enter: the data scientist.

Do I Need A Data Scientist?

Before you get ahead of yourself and start posting job listings tomorrow to hire your first data scientist, have a discussion with your team about whether you even need one.

There are three key questions to ask:

  1. Do we have the need to build complex predictive models about the future versus analyzing large volumes of our existing customer data?
  2. Do we have the capability and appetite to manage our own complex data infrastructure?
  3. Is the majority of our data generated from users’ engagement with a website or mobile app? (That is, is our business primarily online, with repeat user engagement rather than one-time interactions?)

If the answer to any of the above is “no,” then you’re better off using the right third-party analytics platform that your non-technical users can use day to day. Even if you’re generating billions of data points a month from hundreds of millions of users, a third-party tool can potentially still lower total cost of ownership and free up valuable time for both your engineering team (who won’t have to support your infrastructure) and your sales, marketing, product, operations, and customer success teams (who won’t have to lean on internal teams for support).

However, there are two cases where you should consider building out your own data science team: 1) you need to build predictive models, or 2) a significant portion of your business happens offline. If either is true, and you can also support building out your own data infrastructure, then a data science team is worth considering.
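The rule in the paragraph above can be written as a small Python check. This is only one reading of that paragraph, not a formal decision procedure. The function and parameter names are hypothetical, and the yes/no inputs are judgment calls your team would still have to make.

```python
# Illustrative sketch of the rule stated above. All names are hypothetical.

def should_consider_data_science_team(needs_predictive_models: bool,
                                      significant_offline_business: bool,
                                      can_support_own_infrastructure: bool) -> bool:
    """Return True when the case for an in-house data science team applies.

    Either of the two cases must hold, AND the company must be able to
    support building out its own data infrastructure.
    """
    has_a_case = needs_predictive_models or significant_offline_business
    return has_a_case and can_support_own_infrastructure


if __name__ == "__main__":
    # A company that needs predictive models and can run its own infrastructure.
    print(should_consider_data_science_team(True, False, True))
    # A company with a case but no capacity to run its own infrastructure.
    print(should_consider_data_science_team(True, True, False))
```

When the function returns False, the earlier advice applies: a third-party analytics platform is likely the better fit.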

What Do I Look For When Hiring A Data Scientist?

So, let’s say it is a priority for your company – now what?

I often get asked this question by both startup founders and Fortune 500 CTOs. Because the analytics field and data needs have evolved so rapidly, your best potential hires often didn’t previously hold the title “data scientist,” or were working in an analytical role in a specific area like marketing or operations.

Based on personal experience and feedback from founders, the top things I’ve found to look for when hiring your first data scientist are:

  • Pick Intelligence and Innovation First. Data Science is still a new space – you need someone who can be ahead of the curve with the ability to adapt fast to new techniques. Finding someone who is smarter than you and can work with you to build something great is also crucial.
  • Deep Analytics & Statistics Background is Table Stakes. While there is no one-size-fits-all experience to look for on a resume, make sure you hire incredibly analytical people who understand statistics. Look for individuals with experience using analytic tools or data platforms, a physics degree, or anything in between, to ensure they know what they are doing. Some studies suggest we could face a shortage of people with this expertise in the next couple of years, so start identifying these people now.
  • Focused on Outcomes, Not Territory. Some otherwise brilliant data scientist candidates get caught up in being the wise and powerful keepers of the data, and resist adopting new tools and processes that empower non-technical users. I’ve seen repeatedly that up to 90 percent of a data scientist’s time is spent just cleaning and scrubbing data from internal data sources, because not enough investment is made in automation and because data scientists fear making non-technical users too independent. The right candidates will eliminate as much data scrubbing as possible, empower their non-technical teams with the right tools to make day-to-day decisions, and focus their own efforts on the truly hard and strategic data science problems your company faces.
  • Hire From Within. At the end of the day, you need someone who understands your business to make sense of the data. Hiring from within, while not always an option, can be one of the best ways to get a candidate who understands your business. With the growth of online courses specifically targeted at doing data science in the wild, you can potentially upskill a high-potential, highly analytical team member in less time than it takes to hire an outsider.

How Do You Become A Data Scientist?

If you’re on the other side of the fence and looking to become a data scientist, there are several things you should consider. Here’s a list of top things I advise fledgling data scientists to do when they ask me this question:

  • Get trained. If you don’t already have a statistics or data background, you don’t have to go back to school to get the training you need. There are a variety of in-person and web-based training programs available.
  • Get connected. Find people to talk to who actually do this for a living, as well as those who can advise you on what you need to know. If you don’t know anyone, research people online and reach out to them to set up time to talk. If that’s a little too far outside your comfort zone, then find people who use analytics tools on a daily basis to better understand what they do every day.
  • Get practice. There’s no better way to learn if this is right for you than to get your feet wet. There are a variety of places online, like Kaggle, Cloudera’s Data Challenges and Topcoder, where you can try your hand at real-world problems.