Mining of Multi-dimensional Association Rules over Interval Data using Clustering and Characterization


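  • Seung-Hwan Lim (임승환; Department of Electronics, Computer, and Communication Engineering, Hanyang University);
  • Yong-Suk Kwon (권용석; Wireless Research Laboratory, Samsung Electronics);
  • Sang-Wook Kim (김상욱; Department of Electronics, Computer, and Communication Engineering, Hanyang University)
  • Published: 2010.01.15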

Abstract

To discover association rules from non-transactional data, many studies have addressed the discretization of attribute values. These studies, however, do not account for how the confidence of discovered rules changes as the ranges of the discretized attributes change, and they perform the discretization stage and the rule discovery stage independently. As a result, attribute ranges are not discretized properly, and rules with high confidence are excluded from the result set. To solve this problem, we propose a novel method that performs discretization and rule discovery simultaneously, so that attribute ranges are discretized in a way that allows high-confidence rules to be discovered. To this end, we perform hierarchical clustering on the attributes on the right-hand side of rules and then perform characterization on each resulting cluster. Experimental results show that our method discovers high-confidence rules more effectively than existing methods.
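The two steps named in the abstract, hierarchical clustering of a right-hand-side attribute followed by characterization of each cluster, can be illustrated with a minimal Python sketch. The abstract does not give the algorithm's details, so everything here is an assumption made for illustration: a single left-hand-side and a single right-hand-side attribute, centroid-distance merging of neighbouring clusters, characterization by the min-max range of the left-hand-side attribute, and the helper names `agglomerative_1d`, `characterize`, and `mine_rules`. The records are made-up toy values, not data from the paper.

```python
from statistics import mean


def agglomerative_1d(values, k):
    """Bottom-up clustering of 1-D values; returns k (low, high) ranges.

    Assumption: clusters are merged by smallest distance between means.
    In one dimension only neighbouring clusters need to be compared.
    """
    clusters = [[v] for v in sorted(values)]
    while len(clusters) > k:
        gaps = [mean(clusters[i + 1]) - mean(clusters[i])
                for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return [(c[0], c[-1]) for c in clusters]


def characterize(records, lhs, rhs, rhs_range):
    """Describe one RHS cluster by the LHS range of its member records.

    Confidence of "lhs in lhs_range => rhs in rhs_range" is the share of
    records inside the LHS range that also fall inside the RHS range.
    """
    lo, hi = rhs_range
    members = [r for r in records if lo <= r[rhs] <= hi]
    lhs_range = (min(r[lhs] for r in members), max(r[lhs] for r in members))
    covered = [r for r in records if lhs_range[0] <= r[lhs] <= lhs_range[1]]
    return lhs_range, len(members) / len(covered)


def mine_rules(records, lhs, rhs, k, min_conf):
    """Cluster the RHS attribute, characterize each cluster, filter by confidence."""
    rules = []
    for rhs_range in agglomerative_1d([r[rhs] for r in records], k):
        lhs_range, conf = characterize(records, lhs, rhs, rhs_range)
        if conf >= min_conf:
            rules.append((lhs_range, rhs_range, conf))
    return rules


# Toy records (hypothetical values, for illustration only).
RECORDS = [
    {"rooms": 4, "price": 100}, {"rooms": 4, "price": 110},
    {"rooms": 5, "price": 120}, {"rooms": 5, "price": 305},
    {"rooms": 7, "price": 300}, {"rooms": 8, "price": 310},
    {"rooms": 8, "price": 320},
]

if __name__ == "__main__":
    for lhs_range, rhs_range, conf in mine_rules(RECORDS, "rooms", "price", 2, 0.7):
        print(f"rooms in {lhs_range} => price in {rhs_range} (confidence {conf:.2f})")
```

Because the left-hand-side ranges are derived from the right-hand-side clusters rather than fixed in advance, the range boundaries and the rule confidence are determined together, which is the coupling the abstract argues for. A fuller treatment would handle several attributes and a support threshold, both of which this sketch omits.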
