Analyzing Public Opinion with Social Media Data during Election Periods: A Selective Literature Review

  • Received : 2018.08.19
  • Accepted : 2018.08.28
  • Published : 2018.08.31


Many studies have applied data-driven analysis to social media data, and some have even argued that this approach can replace traditional polls. Other studies, however, report contradictory results, and there appears to be no consensus on the methodology of data collection and analysis. As social media-based election research continues and its data collection and analysis methods keep developing, we need to review the key points of the controversy and identify ways forward. Although some previous studies have reviewed the strengths and weaknesses of social media-based election studies, they focused on predictive performance and did not adequately cover studies that used social media to examine other aspects of public opinion during elections, such as the public agenda or information diffusion. This paper examines what information can be obtained from social media data and what limitations such data have. We also review the various attempts to overcome these limitations. Finally, we suggest how social media data can best be used to understand public opinion during elections.

