On Extending the Prefix-Querying Method for Efficient Time-Series Subsequence Matching Under Time Warping

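  • 장병철 (Department of Information and Communications, Hanyang University);
  • 김상욱 (Division of Information and Communications, College of Information and Communications, Hanyang University);
  • 차재혁 (Division of Information and Communications, College of Information and Communications, Hanyang University)
  • Published: 2006.06.01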

Abstract

This paper addresses time-series subsequence matching under time warping. Time warping is a transformation that makes it possible to find sequences with similar patterns even when their lengths differ. The prefix-querying method is the first index-based approach that performs time-series subsequence matching under time warping without false dismissals. To let users formulate queries conveniently, it uses $L_{\infty}$ as its base distance function. In this paper, we extend the prefix-querying method so that it can use $L_1$, the most widely used base distance function in time-series subsequence matching under time warping, in place of $L_{\infty}$. We also formally prove that the proposed method incurs no false dismissals in subsequence matching. We evaluate the method through a variety of experiments. The results show that it improves performance by orders of magnitude over the best-performing previous method.
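
To make the role of the base distance function concrete, the following is a minimal sketch of the standard dynamic-programming formulation of the time warping distance, not the index-based prefix-querying method proposed in the paper. The function name `time_warping_distance` and its `base` parameter are hypothetical names chosen for illustration. With $L_1$ the element-wise costs along the warping path are summed; with $L_{\infty}$ the largest element-wise cost along the path is taken.

```python
from math import inf


def time_warping_distance(s, q, base="l1"):
    """Time warping distance between numeric sequences s and q.

    base="l1":   sum of |s_i - q_j| along the best warping path.
    base="linf": maximum of |s_i - q_j| along the best warping path.
    (Illustrative helper; name and signature are assumptions.)
    """
    if not s or not q:
        raise ValueError("both sequences must be non-empty")
    if base not in ("l1", "linf"):
        raise ValueError("base must be 'l1' or 'linf'")

    m = len(q)
    # prev[j] holds the distance between the first i-1 elements of s
    # and the first j elements of q; only the empty/empty case is 0.
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, len(s) + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - q[j - 1])
            # Best of: repeat q[j-1], repeat s[i-1], or advance both.
            best = min(prev[j], cur[j - 1], prev[j - 1])
            cur[j] = cost + best if base == "l1" else max(cost, best)
        prev = cur
    return prev[m]


if __name__ == "__main__":
    # Sequences of different lengths with the same shape match exactly.
    print(time_warping_distance([1, 2, 2, 3], [1, 2, 3], "l1"))
    print(time_warping_distance([1, 2, 3], [2, 2, 5], "l1"))
    print(time_warping_distance([1, 2, 3], [2, 2, 5], "linf"))
```

A subsequence matching query would, in the naive case, evaluate this distance against every subsequence of every data sequence and keep those within a tolerance; index-based methods such as prefix-querying aim to avoid that exhaustive scan while still returning every qualifying subsequence.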
