Anomaly Detection of Big Time Series Data Using Machine Learning

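A Methodology for Detecting the Anomaly Time Point in Large-Scale Time Series Data Using Machine Learning Techniques: A Case Study of Power Generator Component Signals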

  • Received : 2020.03.04
  • Accepted : 2020.05.04
  • Published : 2020.06.30

Abstract

Machine learning (ML) methods for anomaly detection, such as PCA anomaly detection and CNN image classification, have focused mainly on cross-sectional data. This paper proposes two approaches for applying ML techniques to identify the failure time in large time series data. First, PCA anomaly detection is used to classify each time row as normal or abnormal, by recasting the subject-identification problem in the time domain. Second, CNN image classification is used to identify the failure time after restructuring the time series: the correlation matrix of each one-minute block of data is computed and converted to a TIFF image. In addition, LASSO, a feature selection method, is applied to select the variables that most strongly indicate the failure status. For the empirical study, time series data were collected every second from a power generator with 214 components over 25 minutes, including the 20 minutes before the failure time. PCA anomaly detection detected the failure 9 minutes 17 seconds before it occurred. The combination of LASSO and PCA did not detect it, because the target variable was a binary variable assigned on the basis of the failure time. CNN image classification, trained on 10 normal-status images and 5 failure-status images, detected the failure only one minute before it occurred.
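The restructuring step for the CNN approach (one correlation matrix per one-minute block, rendered as an image) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the non-overlapping windows, the mapping of correlations from [-1, 1] to 8-bit gray levels, and the treatment of a constant signal as uncorrelated are all assumptions made for illustration. The data are synthetic, with 4 signals instead of the 214 components in the study, and the sketch stops at the pixel matrix instead of writing a TIFF file.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a flat signal is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(window):
    """window: list of rows (one per second), each row a list of signal values."""
    columns = list(zip(*window))  # one tuple of readings per signal
    k = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]


def to_gray(matrix):
    """Map correlations in [-1, 1] to pixel intensities in [0, 255] (assumed scaling)."""
    return [[round((r + 1.0) * 127.5) for r in row] for row in matrix]


def windows_to_images(rows, seconds_per_window=60):
    """Split per-second rows into non-overlapping windows and build one image each."""
    images = []
    for start in range(0, len(rows) - seconds_per_window + 1, seconds_per_window):
        window = rows[start:start + seconds_per_window]
        images.append(to_gray(correlation_matrix(window)))
    return images


# Synthetic stand-in data: 3 minutes of per-second readings from 4 signals.
random.seed(0)
rows = [[random.gauss(0, 1) for _ in range(4)] for _ in range(180)]
images = windows_to_images(rows)
```

Under these assumptions, 25 minutes of per-second data would yield 25 images, each with one pixel per pair of components. Labeling each image as normal or failure status would then give a small training set of the kind the abstract describes.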
