Comparative Analysis of the Binary Classification Model for Improving PM10 Prediction Performance

  • Jung, Yong-Jin (Department of Electrical, Electronics and Communication Engineering, Korea University of Technology and Education (KOREATECH))
  • Lee, Jong-Sung (Department of Electrical, Electronics and Communication Engineering, Korea University of Technology and Education (KOREATECH))
  • Oh, Chang-Heon (Department of Electrical, Electronics and Communication Engineering, Korea University of Technology and Education (KOREATECH))
  • Received: 2020.09.29
  • Accepted: 2020.10.14
  • Published: 2021.01.31

Abstract

As particulate matter has become a growing social concern, more accurate forecasts are required, and many studies have applied machine learning to improve particulate matter prediction. However, prediction models train poorly because concentrations are unevenly distributed across levels and because particulate matter has diverse characteristics. To address these problems, this paper proposes a binary classification model that divides the PM10 concentration into two classes at a threshold of 80㎍/㎥. Four classification algorithms were used to build the binary classifiers: logistic regression, decision tree, SVM, and MLP. In a performance evaluation based on the confusion matrix, the MLP model achieved the highest binary classification performance of the four models, with an accuracy of 89.98%.

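As higher accuracy is demanded of particulate matter forecasts, various attempts are being made to raise prediction accuracy by applying machine learning algorithms. However, because of the characteristics of particulate matter and the imbalanced rate of occurrence across concentration levels, prediction models do not train or predict well. To address this, various studies are under way, such as performing prediction separately for low and high concentrations divided at a specific concentration. This paper proposes a binary classification model of particulate matter concentration to address the prediction performance problem caused by the imbalanced nature of particulate matter concentrations. Binary classification models for PM10 were designed using logistic regression, decision tree, SVM, and MLP. Comparing performance through the confusion matrix, the MLP model showed the highest binary classification performance of the four models, with an accuracy of 89.98%.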
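
The labeling and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the concentration values and the predicted labels are made up for illustration, the function names are hypothetical, and treating a reading of exactly 80 as low concentration is an assumption the abstract does not settle. Recall is included alongside accuracy because, with imbalanced classes, accuracy alone can hide missed high-concentration cases.

```python
from collections import Counter

THRESHOLD = 80.0  # class boundary from the abstract, in micrograms per cubic metre


def to_label(pm10, threshold=THRESHOLD):
    """Return 1 (high concentration) above the threshold, else 0 (low)."""
    # Assumption: a reading exactly at the threshold counts as low.
    return 1 if pm10 > threshold else 0


def confusion_matrix(y_true, y_pred):
    """Count TP, FP, FN, TN with class 1 (high concentration) as positive."""
    counts = Counter(zip(y_true, y_pred))
    return {
        "tp": counts[(1, 1)],
        "fp": counts[(0, 1)],
        "fn": counts[(1, 0)],
        "tn": counts[(0, 0)],
    }


def accuracy(cm):
    """Fraction of all samples that were classified correctly."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())


def recall(cm):
    """Fraction of true high-concentration samples that were detected."""
    positives = cm["tp"] + cm["fn"]
    return cm["tp"] / positives if positives else 0.0


# Made-up observed PM10 concentrations (not data from the paper).
observed = [35.0, 42.0, 95.0, 60.0, 81.0, 28.0, 120.0, 55.0, 79.0, 47.0]
y_true = [to_label(v) for v in observed]

# Hypothetical output of some binary classifier for the same samples.
y_pred = [0, 0, 1, 1, 0, 0, 1, 0, 0, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)
print("accuracy:", accuracy(cm))
print("recall:", recall(cm))
```

Any of the four classifiers named in the abstract could supply `y_pred`; the evaluation code does not depend on which model produced the labels.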
