Comparative Geospatial Analysis of Twitter Sentiment Data during the 2008 and 2012 U.S. Presidential Elections
The goal of this thesis is to assess and characterize the representativeness of sampled data that are voluntarily submitted through social media. The case study uses Twitter data associated with the 2012 Presidential election, compared against similarly collected Twitter data from the 2008 Presidential election, to determine how the pro-Democrat bias of sentiment-derived Twitter data mentioning either the Republican or the Democratic Presidential candidate changed at the state level. The comparative analysis shows that the mean absolute error (MAE) fell by nearly half, from 13.1% in 2008 to 7.23% in 2012, which would initially suggest a less biased sample. However, the positive correlation between tweets per county and population density grew stronger, which suggests a much more geographically biased sample.
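The two statistics named in the abstract can be illustrated with a minimal Python sketch. It assumes that the MAE compares a Twitter-derived Democratic share with a reference share for each state, and that the correlation is a Pearson coefficient. The abstract states neither detail, and the function names and all numbers below are hypothetical placeholders, not thesis data.

```python
from math import sqrt


def mean_absolute_error(estimated, reference):
    """Average absolute gap between estimated and reference values."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(estimated)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Hypothetical per-state Democratic share (%) from tweets vs. a reference share.
twitter_share = [60.0, 50.0]
reference_share = [50.0, 45.0]
print(mean_absolute_error(twitter_share, reference_share))

# Hypothetical per-county tweet counts vs. population density.
tweets_per_county = [1.0, 2.0, 3.0]
population_density = [2.0, 4.0, 6.0]
print(pearson_r(tweets_per_county, population_density))
```

A lower MAE only says the statewide estimates sit closer to the reference on average. A correlation nearer to 1 says tweet volume tracks population density more closely, which is why the two measures can point in different directions.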