A Study on the Bias Reduction in Split Variable Selection in CART

  • Published: 2004.12.01


This short communication discusses the bias of CART in split variable selection and proposes a method to reduce it. Penalties proportional to the number of categories or distinct values are applied to the splitting criteria of CART. Empirical comparisons show that the proposed modification of CART reduces the variable selection bias.
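
The abstract does not state the exact form of the penalty, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes categorical predictors, the Gini impurity decrease as the splitting criterion, and a hypothetical linear penalty `penalty * k`, where `k` is the number of distinct values of a variable. The function names and the `penalty` parameter are illustrative and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_gain(values, labels):
    """Largest Gini impurity decrease over all binary splits of a
    categorical variable (exhaustive search over subsets of levels)."""
    levels = sorted(set(values))
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    # Subsets up to half the number of levels cover every binary partition.
    for r in range(1, len(levels) // 2 + 1):
        for left_levels in combinations(levels, r):
            left_set = set(left_levels)
            left = [y for v, y in zip(values, labels) if v in left_set]
            right = [y for v, y in zip(values, labels) if v not in left_set]
            gain = parent - len(left) / n * gini(left) - len(right) / n * gini(right)
            best = max(best, gain)
    return best


def penalized_score(values, labels, penalty):
    """Assumed penalty form: subtract penalty * (number of distinct values)."""
    return best_gain(values, labels) - penalty * len(set(values))


def select_variable(columns, labels, penalty):
    """Return the name of the variable with the highest penalized score."""
    return max(columns, key=lambda name: penalized_score(columns[name], labels, penalty))


labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "binary": ["a", "a", "a", "a", "b", "b", "b", "a"],   # 2 levels
    "id_like": ["p", "q", "r", "s", "t", "u", "v", "w"],  # 8 levels
}
print(select_variable(columns, labels, penalty=0.0))
print(select_variable(columns, labels, penalty=0.05))
```

In this toy data the eight-level variable can always separate the classes perfectly simply because it offers many more candidate splits, so the unpenalized criterion favors it. With the assumed penalty, the two-level variable scores higher. How large the penalty should be is a tuning choice that this sketch does not address.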

