Rule Selection Method in Decision Tree Models

  • Son, Jieun (School of Industrial Management Engineering, Korea University) ;
  • Kim, Seoung Bum (School of Industrial Management Engineering, Korea University)
  • Received : 2013.12.04
  • Accepted : 2014.04.29
  • Published : 2014.08.15

Abstract

Data mining is the process of discovering useful patterns or information in large amounts of data. The decision tree is a data mining algorithm that can be used for both classification and prediction, and it has been widely applied because of its flexibility and interpretability. Decision trees for classification generally generate a number of rules, each of which belongs to one of the predefined categories, and several rules may belong to the same category. In this case, it is necessary to determine the significance of each rule so that users can be given a priority ordering of the rules. The purpose of this paper is to propose a rule selection method for classification tree models that accounts for the number of observations, the accuracy, and the effectiveness of each rule. Our experiments demonstrate that the proposed method performs better than other existing rule selection methods.
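
To make the setting concrete, the following is a minimal sketch, not the authors' method: it enumerates the root-to-leaf rules of a small hand-written classification tree and computes, for each rule, the number of observations it covers and its accuracy on those observations. The tree, the data, and all function names are hypothetical, and the abstract's "effectiveness" measure is not defined here, so it is left out.

```python
# Hypothetical toy tree: internal nodes test "feature <= threshold";
# leaves carry the predicted class label.
TREE = {
    "feature": "age", "threshold": 30,
    "left": {"label": "no"},
    "right": {
        "feature": "income", "threshold": 50,
        "left": {"label": "no"},
        "right": {"label": "yes"},
    },
}

# Hypothetical toy data: (record, true class).
DATA = [
    ({"age": 25, "income": 40}, "no"),
    ({"age": 28, "income": 80}, "yes"),
    ({"age": 35, "income": 45}, "no"),
    ({"age": 40, "income": 60}, "yes"),
    ({"age": 50, "income": 90}, "yes"),
    ({"age": 45, "income": 70}, "no"),
]


def extract_rules(node, conditions=()):
    """Yield (conditions, label) for every root-to-leaf path of the tree."""
    if "label" in node:
        yield list(conditions), node["label"]
        return
    feature, threshold = node["feature"], node["threshold"]
    yield from extract_rules(node["left"], conditions + ((feature, "<=", threshold),))
    yield from extract_rules(node["right"], conditions + ((feature, ">", threshold),))


def matches(record, conditions):
    """Return True if the record satisfies every condition of a rule."""
    return all(
        record[feature] <= threshold if op == "<=" else record[feature] > threshold
        for feature, op, threshold in conditions
    )


def rule_stats(tree, data):
    """Compute the number of covered observations and the accuracy of each rule."""
    stats = []
    for conditions, label in extract_rules(tree):
        covered = [y for x, y in data if matches(x, conditions)]
        n_obs = len(covered)
        accuracy = sum(y == label for y in covered) / n_obs if n_obs else 0.0
        stats.append(
            {"conditions": conditions, "label": label, "n_obs": n_obs, "accuracy": accuracy}
        )
    return stats


stats = rule_stats(TREE, DATA)

# Two rules predict the same category "no"; print their statistics side by side.
for rule in stats:
    if rule["label"] == "no":
        print(rule["conditions"], rule["n_obs"], rule["accuracy"])
```

In this toy example, two rules predict the same category but differ in coverage and accuracy, which is the situation in which a priority among rules is needed. How the two quantities should be combined into one score is deliberately left open.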
