As easy as it would be to hit you with a Matrix reference, I won’t. But, if you’ve ever watched binary code flash endlessly across a computer screen, you at least have a visual (though incomplete) concept of what “big data” is: data that’s so vast and dynamic that customary tools aren’t sufficient to capture and analyze it.

The daily data encounters we have usually revolve around obviously measurable things. The hidden reality, of course, is that data is everywhere, and everything is calculable — the finite number of particles you inhale, the molecules of water you drink. Even if we scale this notion back for business, we’re still faced with massive streams of information generated from social media activity, machine-to-machine communications, or interactions between humans and gadgets. And that’s where the dangers of big data start to emerge.

Follow Me, But Don’t Follow Me Around: We’re Still Concerned About Privacy

Is there anything else in the digital world that gets people worked up like internet privacy invasion? As a rule, internet citizens want total control over the cultivation of their online presence and, more importantly, who’s privy to it. In turn, social platforms, marketers, advertisers, and vendors of every breed have to perpetually strive to glean info that’s useful but not obtrusive. This isn’t anything new (just ask any of the companies involved in the worst privacy scandals of all time).

The problem? Big data needs access in order to be valuable. Says Alex Pentland, a computational social scientist and director of the Human Dynamics Lab at M.I.T., “This data is a new asset. You want it to be liquid and to be used.” That’s all well and good, but misrepresentation happens. Without appropriate and transparent restrictions, data collection runs the very serious risk of becoming skewed and, subsequently, invasive. For example, if a team of health professionals is running an anti-smoking campaign, they might want to know the number of cigarette-centric Google searches entered over a period of time from a single region. But maybe those searches don’t take into account non-keywords (“what are the dangers of…”), or maybe they aren’t properly weighted against cigarette sales in the area. Immediately the data is shaky, and even worse, someone might be flagged as a smoker when they’re not. If that made its way back to a person’s insurance company (which it can), that person might lose coverage.

That’s an extreme example, of course. The fallout could be as minor as getting annoying banner ads for engagement rings because you happened to watch a friend’s wedding video. Either way, collecting data at all implies in-depth analysis, which, for big data, means an approach built on rules and accuracy checks.

On the Other Hand, a Little Invasion Might Save Lives

Just to play devil’s advocate (or to note that a lot of people have accepted that internet anonymity isn’t really a thing), let’s acknowledge the value of identification. People use online search tools to do everything from conducting bank transactions to researching illnesses, but the dark side of the web is vast. If big data can help prevent bad people from doing evil things before they strike, or save the life of someone who’s turned to a prevention or recovery forum, why not? I know, I know: Big Brother is watching, and admittedly, I still get a little spooked when Facebook seems to know exactly how much reality TV I watch on Netflix. But are we really so tight-fisted about our internettings that we’re unwilling to sacrifice a little confidentiality for the greater good?

Some developers have already created smartphone apps that collect this type of preventive data. For example, people who suffer from depression can download Mobilyze. If their movement patterns slow sharply, they receive a check-in call to make sure everything’s okay. The great thing about apps, of course, is that you choose what you use. That won’t always be the case with all things big data, but as tools evolve to handle it, studies are assessing how privacy concerns truly play into data analysis and translation.

As the practice of collecting big data matures, new best practices will emerge alongside inevitable setbacks; in the end, the goal is to derive innovation and solutions from trends and needs, both globally and locally, and drive everything from technological advancement to medical breakthroughs. It might just be a little bumpy along the way.

Want to know more? Check out this infographic from Wipro.