Time Series Representation Combining PIPs Detection and Persist Discretization Techniques for Time Series Classification

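  • Sang-Ho Park (박상호), School of Computer and Information Engineering, Inha University ;
  • Ju-Hong Lee (이주홍), School of Computer and Information Engineering, Inha University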
  • Received : 2010.08.31
  • Accepted : 2010.09.16
  • Published : 2010.09.28

Abstract

Various time series representation methods have been proposed to process time series data efficiently and effectively. SAX (Symbolic Aggregate approXimation) is a representative method that combines segmentation and discretization, and it has been applied successfully to time series classification. However, SAX needs a large number of segments to represent the meaningful dynamic patterns of a time series accurately, because it loses the dynamic properties of the series when it smooths the series' movement. This paper therefore proposes a new time series representation method that combines PIPs (perceptually important points) detection with the Persist discretization technique. The proposed method represents the dynamic movement of a high-dimensional time series in a lower-dimensional space by detecting PIPs, which mark the important inflection points of the series. It then determines the optimal discretization ranges by applying the self-transition and marginal probability distributions to the KL divergence measure. By minimizing the information loss incurred during dimensionality reduction, the proposed method improves the performance of time series classification.
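
The two building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the common formulation of PIP detection (start from the two endpoints and repeatedly add the point with the largest vertical distance to the line joining its neighbouring PIPs) and a Persist-style persistence score (the signed symmetric KL divergence between each symbol's self-transition probability and its marginal probability, averaged over symbols). The function names `detect_pips`, `bernoulli_skl`, and `persistence_score`, and the `eps` smoothing constant, are hypothetical choices made for this illustration.

```python
import numpy as np


def detect_pips(series, k):
    """Return the sorted indices of k perceptually important points (PIPs).

    Assumption: vertical distance to the line joining the two adjacent
    PIPs is used as the importance measure.
    """
    y = np.asarray(series, dtype=float)
    n = len(y)
    if k < 2 or k > n:
        raise ValueError("k must satisfy 2 <= k <= len(series)")
    pips = [0, n - 1]  # the first and last points are always PIPs
    while len(pips) < k:
        best_idx, best_dist = -1, -1.0
        for left, right in zip(pips[:-1], pips[1:]):
            if right - left < 2:
                continue  # no interior point between these two PIPs
            xs = np.arange(left + 1, right)
            # Straight line between the two adjacent PIPs
            line = y[left] + (y[right] - y[left]) * (xs - left) / (right - left)
            dists = np.abs(y[xs] - line)
            j = int(np.argmax(dists))
            if dists[j] > best_dist:
                best_idx, best_dist = int(xs[j]), float(dists[j])
        pips.append(best_idx)
        pips.sort()
    return pips


def bernoulli_skl(p, q):
    """Symmetric KL divergence between Bernoulli(p) and Bernoulli(q)."""
    def kl(a, b):
        return a * np.log(a / b) + (1 - a) * np.log((1 - a) / (1 - b))
    return 0.5 * (kl(p, q) + kl(q, p))


def persistence_score(symbols, n_symbols, eps=1e-9):
    """Average signed divergence between self-transition and marginal
    probabilities; higher values mean the symbols persist longer than
    chance would suggest (assumed Persist-style score)."""
    s = np.asarray(symbols)
    score = 0.0
    for j in range(n_symbols):
        prev_is_j = s[:-1] == j
        if not prev_is_j.any():
            continue  # symbol j never starts a transition
        marginal = np.clip(np.mean(s == j), eps, 1 - eps)
        self_trans = np.clip(np.mean(s[1:][prev_is_j] == j), eps, 1 - eps)
        score += np.sign(self_trans - marginal) * bernoulli_skl(self_trans, marginal)
    return score / n_symbols


if __name__ == "__main__":
    print(detect_pips([0, 1, 0, 5, 0, 1, 0], 3))  # [0, 3, 6]
    # Candidate breakpoints can be compared by discretizing with
    # np.digitize and keeping the set with the higher persistence score.
    values = np.array([0.1, 0.2, 0.1, 0.9, 1.0, 0.8, 0.9, 0.2, 0.1])
    for breakpoints in ([0.5], [0.15]):
        symbols = np.digitize(values, breakpoints)
        print(breakpoints, persistence_score(symbols, len(breakpoints) + 1))
```

In a full pipeline, the values at the detected PIPs would be discretized with the breakpoints that give the highest persistence score. How candidate breakpoints are generated and searched is left open here, because the abstract does not specify it.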

Keywords

Time Series Representation; PIPs Detection; KL Divergence
