A study on decision tree creation using marginally conditional variables


  • Cho, Kwang-Hyun (Department of Early Childhood Education, Changwon National University)
  • Park, Hee-Chang (Department of Statistics, Changwon National University)
  • Received : 2012.02.08
  • Accepted : 2012.03.16
  • Published : 2012.03.31

Abstract

Data mining is a method of searching for interesting relationships among items in a given database, and the decision tree is one of its most representative algorithms. A decision tree classifies a group of interest into several subgroups, or performs prediction. In general, when researchers create a decision tree model, the generated model can become complicated depending on the model-creation criteria and the number of input variables. In particular, when a model has many input variables, the resulting tree can be complex and difficult to analyze. If marginally conditional variables (intervening variables, external variables) exist among the input variables, those variables are judged to have no direct relevance to the target. In this study, we suggest a method of creating a decision tree that accounts for marginally conditional variables, and we apply it to actual data to examine its efficiency.

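The screening step the abstract describes, dropping input variables whose association with the target disappears once another variable is conditioned on, can be sketched as follows. The criterion used here (information gain that vanishes within the strata of some other variable) is an illustrative stand-in, not the authors' actual test, and all names and thresholds are assumptions.

```python
from collections import Counter, defaultdict
import math

def entropy(labels):
    # Shannon entropy of a list of class labels
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(xs, ys):
    # reduction in entropy of ys when split by the values of xs
    n = len(ys)
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    return entropy(ys) - sum(len(g) / n * entropy(g) for g in groups.values())

def conditional_gain(xs, zs, ys):
    # information gain of xs about ys, averaged within each stratum of zs
    n = len(ys)
    strata = defaultdict(lambda: ([], []))
    for x, z, y in zip(xs, zs, ys):
        strata[z][0].append(x)
        strata[z][1].append(y)
    return sum(len(sy) / n * info_gain(sx, sy) for sx, sy in strata.values())

def marginally_conditional(data, target, threshold=0.01):
    # Flag variables that are marginally associated with the target but
    # carry no information once some other variable is conditioned on.
    # (Hypothetical criterion, sketched for illustration only.)
    flagged = set()
    names = [k for k in data if k != target]
    for x in names:
        if info_gain(data[x], data[target]) <= threshold:
            continue  # not even marginally associated with the target
        for z in names:
            if z != x and conditional_gain(data[x], data[z], data[target]) <= threshold:
                flagged.add(x)
                break
    return flagged

# Toy data: Y is determined by Z; X is a noisy copy of Z, so X is
# associated with Y only through Z and should be flagged.
data = {
    "Z": [0, 0, 0, 0, 1, 1, 1, 1],
    "X": [0, 0, 0, 1, 1, 1, 1, 0],
    "Y": [0, 0, 0, 0, 1, 1, 1, 1],
}
print(marginally_conditional(data, "Y"))  # → {'X'}
```

Variables flagged this way would be removed from the input set before the tree is grown, which is how the model stays compact when many inputs are present.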

References

  1. Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984). Classification and regression trees, Wadsworth & Brooks, California.
  2. Cho, K. H. and Park, H. C. (2011a). A study on insignificant rules discovery in association rule mining. Journal of the Korean Data & Information Science Society, 22, 81-88.
  3. Cho, K. H. and Park, H. C. (2011b). A study on decision tree creation using intervening variable. Journal of the Korean Data & Information Science Society, 22, 671-678.
  4. Cho, K. H. and Park, H. C. (2011c). A study on removal of unnecessary input variables using multiple external association rule. Journal of the Korean Data & Information Science Society, 22, 877-884.
  5. Cho, K. H. and Park, H. C. (2011d). Discovery of insignificant association rules using external variable. Journal of the Korean Data Analysis Society, 13, 1343-1352.
  6. Hartigan, J. A. (1975). Clustering algorithms, John Wiley & Sons, New York.
  7. Park, H. C. (2010). Association rule ranking function by decreased lift influence. Journal of the Korean Data & Information Science Society, 21, 397-405.
  8. Quinlan, J. R. (1993). C4.5: Programs for machine learning, Morgan Kaufmann Publishers, San Francisco.

Cited by

  1. Determinants of student course evaluation using hierarchical linear model vol.24, pp.6, 2013, https://doi.org/10.7465/jkdi.2013.24.6.1285
  2. Usage of auxiliary variable and neural network in doubly robust estimation vol.24, pp.3, 2013, https://doi.org/10.7465/jkdi.2013.24.3.659
  3. Analysis of employee's characteristic using data visualization vol.25, pp.4, 2014, https://doi.org/10.7465/jkdi.2014.25.4.727
  4. A study on 3-step complex data mining in society indicator survey vol.23, pp.5, 2012, https://doi.org/10.7465/jkdi.2012.23.5.983
  5. The study on the determinants of the number of job changes vol.26, pp.2, 2015, https://doi.org/10.7465/jkdi.2015.26.2.387
  6. Major gene interactions effect identification on the quality of Hanwoo by radial graph vol.24, pp.1, 2013, https://doi.org/10.7465/jkdi.2013.24.1.151