• Title/Abstract/Keywords: decision tree

1,613 search results (processing time: 0.028 s)

A review of tree-based Bayesian methods

  • Linero, Antonio R.
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 24, No. 6
    • /
    • pp.543-559
    • /
    • 2017
  • Tree-based regression and classification ensembles form a standard part of the data-science toolkit. Many commonly used methods take an algorithmic view, proposing greedy methods for constructing decision trees; examples include the classification and regression trees algorithm, boosted decision trees, and random forests. Recent history has seen a surge of interest in Bayesian techniques for constructing decision tree ensembles, with these methods frequently outperforming their algorithmic counterparts. The goal of this article is to survey the landscape surrounding Bayesian decision tree methods, and to discuss recent modeling and computational developments. We provide connections between Bayesian tree-based methods and existing machine learning techniques, and outline several recent theoretical developments establishing frequentist consistency and rates of convergence for the posterior distribution. The methodology we present is applicable for a wide variety of statistical tasks including regression, classification, modeling of count data, and many others. We illustrate the methodology on both simulated and real datasets.

A methodology for Internet Customer segmentation using Decision Trees

  • Cho, Y.B.;Kim, S.H.
    • 한국지능정보시스템학회:학술대회논문집
    • /
    • 한국지능정보시스템학회 2003 Spring Conference
    • /
    • pp.206-213
    • /
    • 2003
  • Applying existing decision tree algorithms to Internet retail customer classification tends to produce a bushy tree because the source data are imprecise. Even an exhaustive analysis may not guarantee business effectiveness, although its results are derived from fully detailed segments. It is therefore necessary to determine an appropriate number of segments at a certain level of abstraction. In this study, we developed a stopping rule that considers the total amount of information gained while generating a rule tree. In addition to growing the tree forward from the root to intermediate nodes at a certain level of abstraction, the decision tree is examined with a backtracking pruning method that uses misclassification loss information.
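
The idea of an information-based stopping rule can be illustrated with a minimal sketch. This is not the authors' rule: the per-split threshold `min_gain` and the function names are hypothetical, and the abstract's rule considers the total information gained over the whole tree rather than a single split.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy reduction obtained by splitting `parent` into the lists in `children`."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


def should_stop(parent, children, min_gain=0.05):
    """Hypothetical stopping rule: stop growing when a split gains too little information."""
    return information_gain(parent, children) < min_gain
```

A split that separates the classes perfectly yields the full parent entropy as gain, while a split that leaves the class mix unchanged yields zero gain and would trigger the stop.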


투자와 수출 및 환율의 고용에 대한 의사결정 나무, 랜덤 포레스트와 그래디언트 부스팅 머신러닝 모형 예측 (Investment, Export, and Exchange Rate on Prediction of Employment with Decision Tree, Random Forest, and Gradient Boosting Machine Learning Models)

  • 이재득
    • 무역학회지
    • /
    • Vol. 46, No. 2
    • /
    • pp.281-299
    • /
    • 2021
  • This paper analyzes the feasibility of using machine learning methods to forecast employment. A decision tree, an artificial neural network, and ensemble models (random forest and gradient boosting regression tree) were used to forecast employment in the Busan regional economy. The main findings from comparing their predictive abilities are as follows. First, machine learning methods can predict employment well. Second, the employment forecasts from decision tree models differed somewhat according to the depth of the trees. Third, the artificial neural network model, however, did not show high predictive power. Fourth, the ensemble models, random forest and gradient boosting regression tree, showed higher predictive power. Thus, since machine learning methods can accurately predict employment, they should be used to improve the accuracy of employment forecasts.

다중 공정계획을 가지는 정적/동적 유연 개별공정에 대한 의사결정 나무 기반 스케줄링 (Decision Tree based Scheduling for Static and Dynamic Flexible Job Shops with Multiple Process Plans)

  • 유재민;도형호;권용주;신정훈;김형원;남성호;이동호
    • 한국정밀공학회지
    • /
    • Vol. 32, No. 1
    • /
    • pp.25-37
    • /
    • 2015
  • This paper proposes a decision tree based approach for flexible job shop scheduling with multiple process plans. The problem is to determine the operation/machine pairs and the sequence of the jobs assigned to each machine. Two decision tree based scheduling mechanisms are developed, one for static and one for dynamic flexible job shops. In the static case, all jobs are given in advance and the decision tree is used to select a priority dispatching rule to process all the jobs. In the dynamic case, jobs arrive over time and a regularly updated decision tree is used to select a priority rule in real time according to a rescheduling strategy. The two mechanisms were applied to a flexible job shop with reconfigurable manufacturing cells and to a conventional job shop, and the results are reported for various system performance measures.

매개 변수를 이용한 의사결정나무 생성에 관한 연구 (A study on decision tree creation using intervening variable)

  • 조광현;박희창
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 22, No. 4
    • /
    • pp.671-678
    • /
    • 2011
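  • Data mining is a family of techniques for finding useful information that is not readily apparent in vast amounts of data; it includes decision trees, association rules, cluster analysis, and neural network analysis. Among these, the decision tree algorithm charts decision rules to classify a population of interest into several subgroups or to make predictions, and it is used effectively in many areas such as customer segmentation, customer classification, and problem prediction. In general, building a decision tree can yield a complex model depending on the model-building criteria and the number of input variables; in particular, when there are many input variables, building and interpreting the model often becomes difficult. This paper therefore proposes a method that identifies intervening relationships among the input variables and removes the input variables that are unnecessary for tree construction, and applies the method to real data to assess its efficiency.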

결정트리 학습 알고리즘을 활용한 축구 게임 수비 NPC 제어 방법 (NPC Control Model for Defense in Soccer Game Applying the Decision Tree Learning Algorithm)

  • 조달호;이용호;김진형;박소영;이대웅
    • 한국게임학회 논문지
    • /
    • Vol. 11, No. 6
    • /
    • pp.61-70
    • /
    • 2011
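  • This paper proposes a method for controlling defensive NPCs in a soccer game using a decision tree learning algorithm. The proposed method extracts movement-direction patterns and action patterns from actual game users and applies them to the decision tree learning algorithm. The NPC's movement direction and action are then determined from the learned decision tree. Experimental results show that, although learning the decision tree takes some time, determining a movement direction or action from the learned tree takes about 0.001-0.003 ms (milliseconds), which allowed the NPC to be controlled in real time. In addition, because the proposed method uses not only current state information but also relational information derived from it and previous state information, it achieved higher accuracy in determining the movement direction than the existing method (Letia98).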

네트워크 비정상 탐지를 위한 속성 축소를 반영한 의사결정나무 기술 (Decision Tree Techniques with Feature Reduction for Network Anomaly Detection)

  • 강구홍
    • 정보보호학회논문지
    • /
    • Vol. 29, No. 4
    • /
    • pp.795-805
    • /
    • 2019
  • Interest in network anomaly detection techniques for coping with unknown attacks has grown recently. Various studies using data mining, machine learning, and deep learning are under way to develop such techniques. This paper demonstrates the feasibility of network anomaly detection on the NSL-KDD data set using the decision tree, one of the most traditional data mining techniques for classification problems. To mitigate the over-fitting weakness of decision trees, optimal feature selection is performed with the chi-square test. A decision tree model using the 13 selected features achieved a network anomaly detection accuracy of 84% on the NSL-KDD test data set KDDTest+ and 70% on KDDTest-21. These accuracies improve by about 3 and 6 percentage points, respectively, on the 81% and 64% previously reported for decision tree models on these test data sets.
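
Chi-square feature selection, as named in the abstract above, can be sketched in a few lines. This is a minimal illustration for categorical features under assumed names (`chi_square`, `select_top_k`); it is not the paper's implementation and does not reproduce its 13-feature result.

```python
from collections import Counter


def chi_square(feature_values, labels):
    """Pearson chi-square statistic between a categorical feature and class labels."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in label_totals.items():
            # Expected cell count if the feature and the class were independent.
            expected = value_count * label_count / n
            stat += (observed[(value, label)] - expected) ** 2 / expected
    return stat


def select_top_k(features, labels, k):
    """Rank feature names by chi-square score and keep the k highest-scoring ones."""
    scores = {name: chi_square(values, labels) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Features with a high statistic depend strongly on the class label; only those would be passed on to the decision tree learner, which reduces the number of candidate splits.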

의사결정나무를 활용한 2030년 도시 확장 예측 (Urban Sprawl prediction in 2030 using decision tree)

  • 김근한;최희선;김동범;정예림;진대용
    • 한국환경복원기술학회지
    • /
    • Vol. 23, No. 6
    • /
    • pp.125-135
    • /
    • 2020
  • Uncontrolled urban expansion causes various social and economic problems as well as natural and environmental ones. It is therefore necessary to forecast urban expansion by identifying the factors related to it. This study forecasts urban expansion using a decision tree, a method widely used in many areas. The study used spatial data such as use zones, topographic data such as elevation and slope, the environmental conservation value assessment map, and population density data for 2006 and 2018. It extracted the new urban expansion areas by comparing the residential, industrial, and commercial zones in 2006 and 2018, and derived a decision tree using the 2006 data as independent variables. Urban expansion in 2030 is then forecast by applying the 2018 data to the derived decision tree. The analysis confirmed that the distance from green areas, the elevation, the grade on the environmental conservation value assessment map, and the distance from industrial areas were important factors in forecasting urban expansion. In the ROC analysis performed to verify accuracy, the AUC of 0.95051 indicated excellent explanatory power. However, the urban area forecast for 2018 using the decision tree was 15,459.98 km², which differed significantly from the actual 2018 urban area of 4,144.93 km². Decision trees, which are used to forecast urban expansion in many regions, are thus useful for identifying the factors that affect urban expansion, although they are not suitable for forecasting the expansion of urban areas in detail. Identifying these factors is expected to provide information that can be used in future land, urban, and environmental planning.
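
The AUC reported in the ROC analysis above is the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one. A minimal sketch of that pairwise definition follows; the function name `roc_auc` and the 0/1 label encoding are assumptions for illustration, not the study's code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons (ties count as half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

Because the AUC measures only how well the scores rank cases, a high value does not guarantee that the predicted total area matches the observed one, which is consistent with the gap the abstract reports.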

Prediction of the number of public bicycle rental in Seoul using Boosted Decision Tree Regression Algorithm

  • KIM, Hyun-Jun;KIM, Hyun-Ki
    • 한국인공지능학회지
    • /
    • Vol. 10, No. 1
    • /
    • pp.9-14
    • /
    • 2022
  • The demand for public bicycles operated by the Seoul Metropolitan Government is increasing every year. The Seoul public bicycle project, which started with about 5,600 units, had grown to 37,500 units as of September 2021, and the number of members is also increasing every year. However, as the project grows, excessive budget spending and deficits are emerging as problems, and the costs of new bicycles, rental offices, and bicycle maintenance are blamed for the deficit. In this paper, the Azure Machine Learning Studio program and the Boosted Decision Tree Regression technique are used to predict the number of public bicycle rentals from environmental factors and time. The predictions confirmed that demand for public bicycles was high in every season except winter and peaked at 6 p.m. This paper also compares four additional regression algorithms in addition to the Boosted Decision Tree Regression algorithm to measure algorithm performance. Accuracy was highest for Boosted Decision Tree Regression (0.878802), followed by Decision Forest Regression (0.838232), Poisson Regression (0.62699), and Linear Regression (0.618773). Based on these predictions, it is expected that more public bicycles will be placed at rental stations near public transportation to meet growing demand during commuting hours, that more bicycles will be placed at rental stations in summer than in winter, and that the life of bicycles can be extended in winter.

이미지 보간을 위한 의사결정나무 분류 기법의 적용 및 구현 (Adopting and Implementation of Decision Tree Classification Method for Image Interpolation)

  • 김동형
    • 디지털산업정보학회논문지
    • /
    • Vol. 16, No. 1
    • /
    • pp.55-65
    • /
    • 2020
  • With the development of display hardware, image interpolation techniques have been used in various fields such as image zooming and medical imaging. Traditional image interpolation methods, such as bi-linear interpolation, bi-cubic interpolation, and edge direction-based interpolation, perform interpolation in the spatial domain. Recently, interpolation techniques in the discrete cosine transform or wavelet domain have also been proposed. Using these existing interpolation methods together with machine learning, we propose decision tree classification-based image interpolation methods. In other words, this paper concerns a method for adaptively applying various existing interpolation methods, not a new interpolation method itself. To obtain the decision model, we used Weka's J48 library with the C4.5 decision tree algorithm. The proposed method first constructs an attribute set and selects the classes, which correspond to interpolation methods, for the classification model. After training, interpolation is performed with different interpolation methods according to the attribute characteristics. Simulation results show that the proposed method yields reasonable performance.