Here at Topsy, we’re always searching for interesting ways to use data from the social Web. Every day, millions of people publicly share their opinions, describe their behavior, and connect with each other online, and the digital traces they leave can tell us about trends in the real world.

One such real-world trend is physical inactivity, which the CDC reports as a percentage for each county in the United States. These percentages are of particular concern now, as rates of metabolic disease continue to rise. We wondered: would Twitter mentions of physical activity (words like ‘gym’, ‘workout’, or ‘run’) match up with the CDC county results? We examined county-level data from Florida, a state known for its varied demographics. For each Florida county, we counted the tweets tagged with lat/long coordinates that mentioned ‘gym’, ‘workout’, or ‘run’ over the past year, and expressed that count as a percentage of all Twitter activity in the county.
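
To make the calculation concrete, here is a minimal sketch of the per-county percentage. It is not our production pipeline. It assumes each geotagged tweet has already been assigned to a county, and the whole-word, case-insensitive matching rule, the function name `workout_share_by_county`, and the sample tweets are all made up for illustration.

```python
import re
from collections import Counter

# The three keywords come from the post; whole-word, case-insensitive
# matching is an assumption made for this sketch.
WORKOUT_PATTERN = re.compile(r"\b(gym|workout|run)\b", re.IGNORECASE)


def workout_share_by_county(tweets):
    """Return {county: percent of that county's tweets mentioning a keyword}.

    `tweets` is an iterable of (county, text) pairs. Mapping lat/long
    coordinates to a county is assumed to have happened upstream.
    """
    totals = Counter()
    matches = Counter()
    for county, text in tweets:
        totals[county] += 1
        if WORKOUT_PATTERN.search(text):
            matches[county] += 1
    return {county: 100.0 * matches[county] / totals[county] for county in totals}


# Made-up tweets, for illustration only.
sample_tweets = [
    ("Alachua", "Heading to the gym before class"),
    ("Alachua", "Traffic is terrible today"),
    ("Alachua", "Morning run done!"),
    ("Alachua", "Lunch time"),
    ("Baker", "Great workout this morning"),
    ("Baker", "Watching the game"),
    ("Baker", "Running late"),  # no whole-word match for 'run'
    ("Baker", "Coffee first"),
]

shares = workout_share_by_county(sample_tweets)
for county, pct in sorted(shares.items()):
    print(f"{county}: {pct:.1f}% of tweets mention a workout keyword")
```

Simple keyword matching like this will also count uses of ‘run’ that have nothing to do with exercise, so the percentages are best read as a rough signal rather than an exact measure.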

We mapped our results. As you can see in Figure 1 below, the darker green counties in the first map (low mentions of working out on Twitter) for the most part match up with the darker orange counties in the second map (high rates of physical inactivity).

Figure 1: County maps of Twitter workout activity and CDC physical inactivity percentages

As it turns out, the CDC’s physical inactivity rates by county were significantly and negatively correlated with the percentage of each county’s Twitter activity that mentioned a workout (r = -.313, p < .01, Figure 2). In other words, the more physically inactive a Florida county was reported to be, the lower its relative share of ‘gym’, ‘workout’, and ‘run’ mentions on Twitter tended to be.

Figure 2: County Data Points
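
For readers who want to try a correlation like the one above on their own data, here is a minimal sketch of the Pearson coefficient using only the Python standard library. The function name `pearson_r` and the five data points are hypothetical and are not our county figures. The sketch computes r only, not the p-value.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Made-up values, for illustration only (one entry per hypothetical county).
inactivity_pct = [22.0, 25.0, 28.0, 31.0, 34.0]   # % physically inactive
workout_tweet_pct = [1.8, 1.9, 1.2, 1.4, 0.9]     # % of tweets mentioning a workout

r = pearson_r(inactivity_pct, workout_tweet_pct)
print(f"r = {r:.3f}")
```

A negative r means the two measures move in opposite directions. Squaring r gives the share of variance the two measures have in common, so an r of -.313 corresponds to roughly 10 percent: a real but modest relationship.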

This is just one example of how Twitter chatter might be used to track real-world trends that matter. Keep up with the Topsy blog for other interesting results from the social Web.