Temporal Data Mining Framework

시간 데이타마이닝 프레임워크

  • Published : 2002.06.01

Abstract

Temporal data mining, which incorporates temporal semantics into existing data mining techniques, refers to a set of techniques for discovering implicit and useful temporal knowledge from large quantities of temporal data. Temporal knowledge, expressible in the form of rules, is knowledge with temporal semantics and relationships, such as cyclic patterns, calendric patterns, and trends. Many kinds of temporal data, including patient histories, purchase histories, and web logs, can yield useful temporal knowledge. Many studies on data mining have been pursued, and some have addressed aspects of temporal data mining, such as sequential patterns, similar time sequences, and cyclic and temporal association rules. However, these works at best treated the data in a database as a series in chronological order and did not consider the temporal semantics and temporal relationships contained in the data. To solve this problem, we propose a theoretical framework for temporal data mining. This paper surveys the work to date and explores the issues involved in temporal data mining. We then define a model for temporal data mining, suggest an SQL-like mining language able to express temporal mining tasks, and present the architecture of a temporal mining system.

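Temporal data mining adds the notion of time to existing data mining and is defined as "the technique of discovering previously unknown, implicit, and potentially useful temporal knowledge from large volumes of time-stamped data." Temporal knowledge is knowledge that carries temporal semantics and temporal relationships, such as periodic patterns, calendar patterns, and trends. The real world contains many kinds of temporal data, such as patient medical histories, product purchase histories, and web logs, from which various forms of useful temporal knowledge can be found. As data mining research has progressed, partial studies of temporal data mining have sought to discover temporal knowledge, for example through sequential patterns, similar time series search, and cyclic association rule discovery. However, existing studies focus simply on finding the order of occurrence of data and similar patterns, so they fall short in discovering the temporal semantics and temporal relationships contained in the data, and they address only parts of temporal knowledge, such as association rules, rather than temporal knowledge as a whole. Therefore, for a systematic study of temporal data mining, this paper analyzes existing research on temporal data mining and the problems that remain to be solved, and on that basis presents an overall framework. It also describes an implementation approach and an application evaluation. Within the framework, we propose a temporal data mining model and, based on it, design a temporal data mining query language and a temporal data mining system capable of discovering temporal knowledge.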
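
The distinction the abstract draws between plain chronological ordering and calendar semantics can be illustrated with a small Python sketch. It is not the model, mining language, or system proposed in the paper. It only shows, under assumed names and data, how the support of an itemset can be measured per weekday so that a calendric pattern becomes visible that a single overall support value would hide. The functions `support_by_weekday` and `calendric_pattern`, the `min_support` threshold, and the purchase history are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def support_by_weekday(transactions, itemset):
    """Map each weekday (Monday=0 ... Sunday=6) to the fraction of that
    weekday's transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    totals = defaultdict(int)  # transactions seen per weekday
    hits = defaultdict(int)    # transactions containing the itemset per weekday
    for day, items in transactions:
        wd = day.weekday()
        totals[wd] += 1
        if itemset <= set(items):
            hits[wd] += 1
    return {wd: hits[wd] / totals[wd] for wd in totals}


def calendric_pattern(transactions, itemset, min_support):
    """Return the sorted weekdays on which the itemset reaches `min_support`."""
    by_day = support_by_weekday(transactions, itemset)
    return sorted(wd for wd, s in by_day.items() if s >= min_support)


# Hypothetical purchase history: (purchase date, items bought).
transactions = [
    (date(2002, 6, 1), {"beer", "chips"}),          # Saturday
    (date(2002, 6, 1), {"beer", "chips", "milk"}),  # Saturday
    (date(2002, 6, 3), {"milk"}),                   # Monday
    (date(2002, 6, 3), {"beer"}),                   # Monday
    (date(2002, 6, 8), {"beer", "chips"}),          # Saturday
    (date(2002, 6, 10), {"milk", "bread"}),         # Monday
]

pattern_days = calendric_pattern(transactions, {"beer", "chips"}, min_support=0.8)
print(pattern_days)
```

In this toy data the itemset appears in half of all transactions, but every occurrence falls on a Saturday. Grouping by a calendar unit is one simple way to expose that kind of temporal relationship; other granularities (month, hour, or a user-defined calendar) would follow the same shape.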
