Fabio Rojas, an associate professor of sociology at Indiana University, posted a little piece about polling in the Washington Post entitled “How Twitter can help predict an election.” The title is titillating, but fairly innocuous. Not so for the article itself, in which Rojas asserts that:
- Twitter “discussions” can predict elections so we’ll be able to forget about elections and choose our representatives via digital democracy
- Online public information will replace the polling industry
- Anyone who can write computer code can offer the same analysis as a trained polling professional (pollster)
Whoa! I’d better find a new job teaching stand-up paddleboarding.
But let’s take a look at those assertions. First, the races Rojas included in his study were mostly what we consider non-competitive. Included in his analysis were all House races where there were two candidates running, so a race pitting a heavy favorite against a longshot equaled “competitive” in Rojas’s world. Obviously, it is a lot easier to predict a winner when most races are foregone conclusions.
More importantly, that 404 of 406 record was based on a model that included a host of other predictive variables. The percentage voting for McCain, the percentage of white voters, and the percentage of college-educated voters were added to the tweets to create a model with an adjusted R² of .87. All that means is that elections can be predicted accurately when a bunch of other variables that we know predict election outcomes are included in the equation. When Rojas’s model consists of the tweets alone, the R² drops to .28! So it is the other variables, not the tweets, that are doing the robust predicting.
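For readers who want to see the mechanics, here is a minimal sketch of that comparison in Python. To be clear, this is not Rojas's data or his model: the districts are synthetic, and the names (`prior_vote`, `tweet_share`, `ols_r2`) and every number in the setup are assumptions made up for illustration. The only point is that a weak predictor can ride along in a model that fits well because of a strong, well-known predictor.

```python
import random

def ols_r2(X, y):
    """Fit y on the columns of X plus an intercept; return (R^2, adjusted R^2)."""
    n, p = len(y), len(X[0])
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = p + 1
    # Augmented normal equations: (A^T A | A^T y)
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    beta = [M[i][k] / M[i][i] for i in range(k)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R^2 penalizes each extra predictor (p excludes the intercept)
    adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return r2, adj

# Synthetic districts (assumption: invented numbers, not real election data)
random.seed(0)
n = 400
prior_vote = [random.uniform(0.2, 0.8) for _ in range(n)]    # party's share last time
tweet_share = [0.5 + 0.5 * (pv - 0.5) + random.gauss(0, 0.12)
               for pv in prior_vote]                         # noisy echo of the district's lean
vote = [pv + random.gauss(0, 0.05) for pv in prior_vote]     # outcome driven by the prior lean

r2_tweets, adj_tweets = ols_r2([[t] for t in tweet_share], vote)
r2_full, adj_full = ols_r2([[t, pv] for t, pv in zip(tweet_share, prior_vote)], vote)

print(f"tweets only:        R2={r2_tweets:.2f}  adjusted R2={adj_tweets:.2f}")
print(f"tweets + prior vote: R2={r2_full:.2f}  adjusted R2={adj_full:.2f}")
```

In this toy setup the tweets only echo how the district already leans, so the tweets-only model fits poorly and the fit jumps once the prior vote is added. That is the same pattern as the .28 versus .87 gap described above, though the exact figures here depend entirely on the invented inputs.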
Some observers say it is only a matter of time before Twitter (or another social media app) achieves 80% coverage, at which point it would supposedly substitute for pre-election polling. But the main job of a campaign poll is not to predict the outcome of an election. A campaign poll’s main purpose is to provide strategic advice. That can’t be done with variables coming from Twitter feeds.
Then there’s the scary prospect posed by this article from Andrea Peterson (coincidentally posted the same day as Rojas’s and also from the Washington Post). Peterson’s thesis is that the moment Twitter becomes useful for anything regarding campaigns and elections, consultants will work overtime to defeat it. Because all social media is public, I don’t see anything to stop the bots from mucking up future campaigns.
It sure would be great (except for the interviewers who get paid to do the hard work) to magically acquire cheap, accurate and usable public opinion data. But at the end of the day, there is no magic.