People sometimes look at me like I’m insane when I say that big data can be fun.

I have to say, nine times out of ten they are totally justified in looking confused when I mention that I get just as much pleasure from analyzing a pivot table of ten thousand tweets, mapped out and tracked by mentions, as I do when I’m, you know… eating a steak or something.

However, I have come to discover that the entertainment value of big data is largely subjective. What are you analyzing? And is it kind of silly? The comedic potential of big data is definitely there. You just have to look for it.

Recently, the Marketing Robot decided on its own to use the enterprise tweet-monitoring capabilities of Topsy analytics to see just how often people in America were using some of the more unsavory words in the English language on Twitter, and where those people were concentrated. The result is the infographic below, and the headline findings are somewhat obvious in their audacity: people curse on the Internet a lot, and they do it from pretty much everywhere.

[Infographic: Filthy Twitter]

A cursory review of this graphic shows that, yes indeed, more tweets containing swear words come from the more populous states. This makes sense: the more people who live in a state, the more tweets (and therefore the more swearing) you would expect to come from that state.

However, looking at the volume of questionable tweets as a ratio of dirty tweets to population gives you a new metric for comparing states, one aptly dubbed “filthy mouths per capita” (FMPC). Once you take FMPC into account, it turns out that Rhode Island has the highest rate of dirty tweets in relation to its population! See? I told you data could be (kind of) fun…
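For anyone who wants to try the per-capita trick at home, here is a minimal Python sketch of the idea. To be clear, this is not the Marketing Robot's actual method or data: the function name, the "per 100,000 residents" scaling, and every number below are made-up placeholders, chosen only to show how a small state can top the ranking once you divide by population.

```python
def filthy_mouths_per_capita(dirty_tweets, population, per=100_000):
    """Return dirty tweets per `per` residents for each state.

    Hypothetical helper for illustration; scaling by 100,000 is an
    assumption, not something taken from the infographic.
    """
    return {
        state: dirty_tweets[state] * per / population[state]
        for state in dirty_tweets
    }


# Placeholder numbers only. These are NOT the figures behind the infographic.
dirty_tweets = {"Big State": 50_000, "Mid State": 12_000, "Small State": 3_000}
population = {"Big State": 20_000_000, "Mid State": 6_000_000, "Small State": 1_000_000}

fmpc = filthy_mouths_per_capita(dirty_tweets, population)

# Rank states two ways: by raw volume and by the per-capita rate.
by_volume = sorted(dirty_tweets, key=dirty_tweets.get, reverse=True)
by_fmpc = sorted(fmpc, key=fmpc.get, reverse=True)

print("Ranked by raw volume:", by_volume)
print("Ranked by FMPC:      ", by_fmpc)
```

With these placeholder numbers, the biggest state wins on raw volume, but the smallest state comes out on top once the counts are divided by population, which is the same kind of reversal described above.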