Approximation of Frequent Itemsets with Maximum Size by One-scan for Association Rule Mining Application

연관 규칙 탐사 응용을 위한 한 번 읽기에 의한 최대 크기 빈발항목 추정기법

  • Gab-Soo Han (Department of Computer Information, Doowon Technical College)
  • Published : 2008.08.29

Abstract

Advances in data acquisition and processing have increased the number of data mining applications that handle continuous data generated in real time. Mining association rules in such applications requires new techniques for finding frequent itemsets. Most existing techniques scan the entire database repeatedly, which is impossible when data arrive continuously in real time; each interval of data can be read only once. This paper therefore proposes an approximation technique that reads each incoming data interval once and estimates the size of the maximum frequent itemset and the items it contains, which makes the required association rule mining possible.
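
The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of the general idea of bounding the maximum frequent itemset size in a single scan. It is not the paper's algorithm. It relies on two standard observations: a frequent itemset of size k needs at least `min_support` transactions of length k or more, and at least k items that are frequent on their own. The function name `estimate_max_itemset_size` and its parameters are hypothetical, chosen for illustration.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def estimate_max_itemset_size(
    transactions: Iterable[Iterable[str]], min_support: int
) -> Tuple[int, Set[str]]:
    """Upper-bound the size of the largest frequent itemset in one pass.

    Illustrative assumption, not the method proposed in the paper.
    Returns the bound and the items that could belong to such an itemset.
    """
    item_counts: Counter = Counter()    # support of each single item
    length_counts: Counter = Counter()  # number of transactions per length

    # Single scan: only per-item and per-length counters are kept.
    for transaction in transactions:
        items = set(transaction)
        item_counts.update(items)
        length_counts[len(items)] += 1

    # Items frequent on their own; any frequent itemset uses only these.
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Walk k downward from the longest transaction. The first k with enough
    # long transactions and enough frequent items is the largest feasible size.
    bound = 0
    long_enough = 0  # transactions with length >= k
    for k in range(max(length_counts, default=0), 0, -1):
        long_enough += length_counts.get(k, 0)
        if long_enough >= min_support and len(frequent_items) >= k:
            bound = k
            break
    return bound, frequent_items
```

The result is an upper bound rather than an exact answer: no frequent itemset can be larger than the returned size, but one of exactly that size is not guaranteed to exist. The memory used depends on the number of distinct items and transaction lengths, not on the number of transactions, which is what makes a single pass over a data interval sufficient.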
