A Study on Unbiased Methods in Constructing Classification Trees
 Title & Authors
Lee, Yoon-Mo; Song, Moon Sup;
 Abstract
We propose two methods that separate the variable selection step from the split-point selection step. We call these two algorithms the CHITES method and the F&CHITES method. They adapt some of the best characteristics of CART, CHAID, and QUEST. In the first step, the variable most significant for predicting the target class values is selected. In the second step, an exhaustive search is applied to find the split point of the variable selected in the first step. We compared the proposed methods with CART and QUEST in terms of variable selection bias and power, error rates, and training times. The proposed methods are not only unbiased in the null case, but also powerful for selecting the correct variables in non-null cases.
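The two-step procedure described in the abstract can be sketched as follows. This is a minimal illustration under assumed choices that the abstract does not specify: step 1 screens variables by the p-value of a chi-square test on a 2x2 table formed by a median split against a binary class, and step 2 exhaustively searches cut points of the selected variable using the Gini impurity. All function names and the choice of statistics are hypothetical, not the paper's exact algorithm.

```python
# Sketch of a two-step split construction: (1) select the most
# significant variable, (2) exhaustively search its split point.
# The chi-square screening and Gini criterion are illustrative
# assumptions, not necessarily the statistics used in the paper.
import math

def chi2_pvalue_df1(stat):
    # Survival function of chi-square with 1 df: P(X > stat) = erfc(sqrt(stat/2)).
    return math.erfc(math.sqrt(stat / 2.0))

def median_split_chi2(x, y):
    # Pearson chi-square statistic for the 2x2 table
    # (x above/below its median) vs. a binary class y in {0, 1}.
    med = sorted(x)[len(x) // 2]
    counts = [[0, 0], [0, 0]]
    for xi, yi in zip(x, y):
        counts[int(xi > med)][yi] += 1
    total = len(x)
    stat = 0.0
    for r in range(2):
        for c in range(2):
            expected = sum(counts[r]) * (counts[0][c] + counts[1][c]) / total
            if expected > 0:
                stat += (counts[r][c] - expected) ** 2 / expected
    return stat

def gini(y):
    # Gini impurity of a binary label list.
    if not y:
        return 0.0
    p = sum(y) / len(y)
    return 2.0 * p * (1.0 - p)

def best_split(X, y):
    # X is a list of numeric columns; y is a binary class vector.
    # Step 1: variable selection by smallest chi-square p-value.
    pvals = [chi2_pvalue_df1(median_split_chi2(col, y)) for col in X]
    var = min(range(len(X)), key=lambda j: pvals[j])
    # Step 2: exhaustive search over midpoints of the selected variable,
    # minimizing the weighted Gini impurity of the two child nodes.
    col = X[var]
    values = sorted(set(col))
    best_cut, best_imp = None, float("inf")
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2.0
        left = [yi for xi, yi in zip(col, y) if xi <= cut]
        right = [yi for xi, yi in zip(col, y) if xi > cut]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if imp < best_imp:
            best_cut, best_imp = cut, imp
    return var, best_cut
```

Because only the p-value of the selection test depends on the candidate variable, and the exhaustive search is run just once on the winner, this structure avoids the bias toward variables with many distinct split points that a pure exhaustive search over all variables exhibits, which is the motivation stated in the abstract.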
 Keywords
variable selection bias; variable selection power; exhaustive search method
 Language
English
 Cited by
1. An Algorithm for Constructing Regression Trees Without Variable Selection Bias, 김진흠; 김민호; The Korean Journal of Applied Statistics, 2004, vol.17, no.3, pp.459-473.
2. Bias Reduction in Split Variable Selection in C4.5, Communications for Statistical Applications and Methods, 2003, vol.10, no.3, pp.627-635.
3. Input Variable Importance in Supervised Learning Models, Communications for Statistical Applications and Methods, 2003, vol.10, no.1, pp.239-246.
4. Tree-Structured Clustering for Mixed Data, 양경숙; 허명회; The Korean Journal of Applied Statistics, 2006, vol.19, no.2, pp.271-282.
5. An Unbiased Method for Constructing Multilabel Classification Trees, Computational Statistics & Data Analysis, 2004, vol.47, no.1, p.149.
 References
1. UCI Repository of Machine Learning Databases, 1998.
2. Classification and Regression Trees, 1984.
3. Applied Statistics, 1980, vol.29, pp.119-127.
4. Journal of the American Statistical Association, 2001, vol.96, pp.589-604.
5. Ph.D. Thesis, 2002.
6. Statistica Sinica, 1997, vol.7, pp.815-840.
7. Journal of the American Statistical Association, 1988, vol.83, pp.715-728.
8. C4.5: Programs for Machine Learning, 1993.
9. Proceedings of the Tenth Japan and Korea Joint Conference of Statistics, 2000, pp.125-130.