Exploring Opinions on COVID-19 Vaccines through Analyzing Twitter Posts

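  • 정우진 (Department of Library and Information Science, Sungkyunkwan University) ;
  • 김규리 (Department of Library and Information Science, Sungkyunkwan University) ;
  • 유승희 (Data Science Convergence Major, Sungkyunkwan University) ;
  • 주영준 (Department of Library and Information Science, Sungkyunkwan University)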
  • Received : 2021.11.15
  • Accepted : 2021.12.20
  • Published : 2021.12.30

Abstract

This study aimed to understand public opinion on COVID-19 vaccines by analyzing vaccine-related Twitter posts. We collected 45,413 tweets posted between March 16, 2020 and March 15, 2021 that included a COVID-19 vaccine name as a keyword. The 12 vaccine names used for data collection were, in descending order of the number of collected posts, 'Pfizer', 'AstraZeneca', 'Moderna', 'Janssen', 'Novavax', 'Sinopharm', 'Sinovac', 'Sputnik V', 'Bharat', 'CanSino', 'Chumakov', and 'VECTOR'. The collected posts were analyzed both manually and automatically through keyword analysis, sentiment analysis, and topic modeling to explore opinions on these vaccines. Overall, negative posts about the vaccines outnumbered positive ones. Anxiety about the aftereffects of vaccination and distrust in vaccine efficacy were identified as the major negative factors. In contrast, the expectation that vaccination would suppress the spread of the coronavirus was identified as a positive social factor. Unlike previous studies that investigated opinions about COVID-19 vaccines through mass media data such as news articles, this study explores the opinions of social media users using keyword analysis, sentiment analysis, and topic modeling. In addition, the results can inform governmental institutions in making vaccination-promotion policies that reflect the social atmosphere.

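This study analyzed vaccine-related posts written on Twitter in order to understand public opinion on coronavirus disease 2019 (hereafter, the coronavirus) vaccines. We collected and analyzed 45,413 posts that were written on Twitter over the one-year period from March 16, 2020 to March 15, 2021 and that included the name of a coronavirus vaccine as a keyword. A total of 12 vaccine keywords were used for data collection; in descending order of the number of collected posts, they were 'Pfizer', 'AstraZeneca', 'Moderna', 'Janssen', 'Novavax', 'Sinopharm', 'Sinovac', 'Sputnik', 'Bharat', 'CanSino', 'Chumakov', and 'Vector'. The collected posts were examined with a combination of manual and automated methods, using keyword analysis, sentiment analysis, and topic modeling to explore opinions on the vaccines. According to the results, reactions to the vaccines were negative overall, and anxiety about the aftereffects of vaccination and distrust in vaccine efficacy were identified as the major negative factors. In contrast, the expectation that vaccination would suppress the spread of the coronavirus was confirmed to be a positive social factor. Whereas previous studies sought to gauge the social atmosphere around coronavirus vaccines through mass media data such as news, this study has academic significance in that it collects social media data and applies several analysis methods, including keyword analysis, sentiment analysis, and topic modeling, to understand the opinions of the public. The results also carry a practical implication: they can contribute to establishing vaccination-promotion policies that reflect the social atmosphere around vaccines.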
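To make the keyword and sentiment steps named in the abstract more concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a simple lexicon-based sentiment approach, which the abstract does not confirm. The sample posts, the tiny `LEXICON`, the three-name `VACCINES` subset, and all function names are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical mini-lexicon (word -> polarity); a real analysis would load
# a full sentiment lexicon for the language of the posts.
LEXICON = {"safe": 1, "effective": 1, "hope": 1,
           "side-effect": -1, "fear": -1, "distrust": -1}

# Illustrative subset of the 12 vaccine keywords listed in the abstract.
VACCINES = ["pfizer", "astrazeneca", "moderna"]


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return [tok.strip(".,!?\"'") for tok in text.lower().split()]


def count_vaccine_mentions(posts):
    """Count how many posts mention each vaccine keyword at least once."""
    counts = Counter()
    for post in posts:
        tokens = set(tokenize(post))
        for name in VACCINES:
            if name in tokens:
                counts[name] += 1
    return counts


def sentiment_label(post):
    """Sum lexicon polarities over the tokens and map the total to a label."""
    score = sum(LEXICON.get(tok, 0) for tok in tokenize(post))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up example posts (not drawn from the study's data).
posts = [
    "Pfizer seems safe and effective.",
    "I fear the AstraZeneca side-effect reports.",
    "Moderna appointment booked.",
]

mention_counts = count_vaccine_mentions(posts)
labels = [sentiment_label(p) for p in posts]
```

Ranking `mention_counts` gives the kind of per-vaccine ordering reported in the abstract, and tallying `labels` gives a positive versus negative comparison. Korean-language posts would need a morphological analyzer instead of whitespace splitting.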
