Prediction of Covid-19 confirmed number of cases using ARIMA model

ARIMA모형을 이용한 코로나19 확진자수 예측

  • Kim, Jae-Ho (Department of Computer Science, The University of Suwon) ;
  • Kim, Jang-Young (Department of Computer Science, The University of Suwon)
  • Received : 2021.11.04
  • Accepted : 2021.11.27
  • Published : 2021.12.31

Abstract

The COVID-19 outbreak that began in Wuhan, Hubei Province, around December 2019 appeared to be subsiding, but confirmed cases were rising again as of November 2020 and June 2021. Total confirmed cases are estimated at 192 million worldwide and approximately 184 thousand in South Korea. In response, the Central Disaster and Safety Countermeasures Headquarters has taken strong measures by implementing Level 4 social distancing. However, as highly infectious variants such as Delta have spread, the number of daily confirmed cases in Korea has risen to the 1,800 range. To emphasize the severity of COVID-19, this paper therefore predicts the cumulative number of confirmed cases using an ARIMA model. Differencing is used to remove trend and seasonality, and the ARIMA orders p, d, and q are chosen by examining the AR and MA components through the autocorrelation function (ACF) and partial autocorrelation function (PACF). Finally, the forecast values are compared with the actual values to evaluate forecast accuracy.
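The workflow described in the abstract (difference the series, inspect the ACF and PACF, fit, forecast, compare with actual values) can be illustrated with a small sketch. The abstract does not state which software or which orders (p, d, q) the authors used, so everything here is an assumption: the model is a hand-rolled ARIMA(1,1,0) fitted by least squares in pure Python, the function names are hypothetical, and the cumulative counts are made-up numbers, not the paper's data.

```python
import math


def difference(series, d=1):
    """Apply d rounds of first-order differencing to remove trend."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(v * v for v in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]


def pacf(series, max_lag):
    """Sample partial autocorrelation via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    result, prev = [1.0], []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result


def fit_ar1(w):
    """Least-squares fit of w[t] = c + phi * w[t-1]; returns (c, phi)."""
    x, y = w[:-1], w[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    phi = (sum((a - mx) * (b - my) for a, b in zip(x, y))
           / sum((a - mx) ** 2 for a in x))
    return my - phi * mx, phi


def forecast_arima_110(series, steps):
    """Forecast `steps` values with an ARIMA(1,1,0) model (assumed orders)."""
    w = difference(series, d=1)
    c, phi = fit_ar1(w)
    level, last = series[-1], w[-1]
    forecasts = []
    for _ in range(steps):
        last = c + phi * last   # predict the next daily increase
        level += last           # undo the differencing
        forecasts.append(level)
    return forecasts


def rmse(predicted, actual):
    """Root mean squared error between forecast and actual values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                     / len(actual))


if __name__ == "__main__":
    # Made-up cumulative confirmed counts, for illustration only.
    cumulative = [1000, 1120, 1260, 1390, 1545, 1690, 1850, 2005, 2175,
                  2340, 2520, 2695, 2885, 3070, 3265, 3455, 3660]
    train, actual = cumulative[:14], cumulative[14:]
    daily = difference(train)
    print("ACF :", [round(v, 3) for v in acf(daily, 3)])
    print("PACF:", [round(v, 3) for v in pacf(daily, 3)])
    predicted = forecast_arima_110(train, len(actual))
    print("forecast:", [round(v, 1) for v in predicted])
    print("RMSE:", round(rmse(predicted, actual), 2))
```

In practice the orders would be read off the ACF and PACF plots of the differenced series rather than fixed in advance, and a statistics library would normally replace the hand-written fitting step.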
