Machine Learning-based Quality Control and Error Correction Using Homogeneous Temporal Data Collected by IoT Sensors

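  • Hye-Jin Kim (Department of Computer Science, Kwangwoon University) ;
  • Hyun-Soo Lee (Technology Research Institute, Jubix Co., Ltd.) ;
  • Byung-Jin Choi (Technology Research Institute, Jubix Co., Ltd.) ;
  • Yong-Hyuk Kim (Department of Computer Science, Kwangwoon University)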
  • Received : 2019.01.28
  • Accepted : 2019.04.20
  • Published : 2019.04.28

Abstract

In this paper, quality control (QC) is applied to each meteorological element of weather data collected by seven types of IoT sensors, including temperature. In addition, we propose a machine learning method for estimating values that QC regards as errors. Data flagged as errors by the basic QC were replaced by linear interpolation, and machine learning-based QC was then performed on the result. Support vector regression, a decision table, and a multilayer perceptron were used as the machine learning techniques. The mean absolute error (MAE) of the machine learning models applied to data that had undergone basic QC and interpolation was 21% lower than that of the models applied without basic QC. Among the techniques, support vector regression performed best: averaged over all meteorological elements, its MAE was 24% lower than that of the multilayer perceptron and 58% lower than that of the decision table.
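The preprocessing and evaluation steps named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the range check in `flag_out_of_range` and its thresholds are hypothetical stand-ins for the basic QC tests (whose details are given only in Table 2), and all function names are assumptions introduced here. The sketch flags suspect readings, replaces them by linear interpolation between the nearest valid neighbours, and computes the MAE and the relative MAE reduction used to compare models.

```python
from typing import List, Sequence


def flag_out_of_range(values: Sequence[float], low: float, high: float) -> List[bool]:
    """Hypothetical basic QC: flag readings outside the range [low, high]."""
    return [not (low <= v <= high) for v in values]


def interpolate_flagged(values: Sequence[float], flagged: Sequence[bool]) -> List[float]:
    """Replace flagged readings by linear interpolation over the sample index.

    Flagged runs at the start or end of the series have only one valid
    neighbour, so they copy that neighbour's value instead.
    """
    good = [i for i, bad in enumerate(flagged) if not bad]
    if not good:
        raise ValueError("no valid readings to interpolate from")
    out = list(values)
    for i, bad in enumerate(flagged):
        if not bad:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            out[i] = values[left] + weight * (values[right] - values[left])
    return out


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """MAE: the average of |actual - predicted| over all samples."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline_mae: float, new_mae: float) -> float:
    """Percentage by which new_mae is lower than baseline_mae."""
    return 100.0 * (baseline_mae - new_mae) / baseline_mae


if __name__ == "__main__":
    # Made-up humidity readings (%); 150.0 and -5.0 are injected errors.
    humidity = [40.0, 42.0, 150.0, 46.0, -5.0, -5.0, 52.0]
    flags = flag_out_of_range(humidity, low=0.0, high=100.0)
    print(interpolate_flagged(humidity, flags))
```

The interpolated series would then serve as input to a regression model such as support vector regression, and `relative_reduction` expresses comparisons of the kind reported in the abstract (for example, one model's MAE being some percentage lower than another's).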

Fig. 1. Location where the IoT sensors are installed (latitude: 37.708, longitude: 126.895)

Fig. 2. Flowchart of quality control using machine learning

Fig. 3. Raw humidity data (data regarded as errors by the basic QC are marked on the X-axis)

Fig. 4. Graph of humidity data corrected using support vector regression

Table 1. Information on sensors collecting weather data

Table 2. Details of basic quality control

Table 3. Error rate of basic quality control (%)

Table 4. Machine learning-based QC and error correction results on raw data

Table 5. Machine learning-based QC and error correction results on non-interpolated data after basic QCs

Table 6. Machine learning-based QC and error correction results on interpolated data after basic QCs
