• Title/Summary/Keyword: decision tree

A review of tree-based Bayesian methods

  • Linero, Antonio R.
    • Communications for Statistical Applications and Methods
    • /
    • v.24 no.6
    • /
    • pp.543-559
    • /
    • 2017
  • Tree-based regression and classification ensembles form a standard part of the data-science toolkit. Many commonly used methods take an algorithmic view, proposing greedy methods for constructing decision trees; examples include the classification and regression trees algorithm, boosted decision trees, and random forests. Recent history has seen a surge of interest in Bayesian techniques for constructing decision tree ensembles, with these methods frequently outperforming their algorithmic counterparts. The goal of this article is to survey the landscape surrounding Bayesian decision tree methods, and to discuss recent modeling and computational developments. We provide connections between Bayesian tree-based methods and existing machine learning techniques, and outline several recent theoretical developments establishing frequentist consistency and rates of convergence for the posterior distribution. The methodology we present is applicable for a wide variety of statistical tasks including regression, classification, modeling of count data, and many others. We illustrate the methodology on both simulated and real datasets.

A methodology for Internet Customer segmentation using Decision Trees

  • Cho, Y.B.;Kim, S.H.
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2003.05a
    • /
    • pp.206-213
    • /
    • 2003
  • Application of existing decision tree algorithms for Internet retail customer classification is apt to construct a bushy tree due to imprecise source data. Even excessive analysis may not guarantee the effectiveness of the business although the results are derived from fully detailed segments. Thus, it is necessary to determine the appropriate number of segments with a certain level of abstraction. In this study, we developed a stopping rule that considers the total amount of information gained while generating a rule tree. In addition to forwarding from root to intermediate nodes with a certain level of abstraction, the decision tree is investigated by the backtracking pruning method with misclassification loss information.
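The abstract does not give the authors' exact stopping rule, so the following is only a minimal sketch of the general idea: measure the information gained by a candidate split and stop growing the tree when that gain is too small. The threshold `min_gain`, the function names, and the toy labels are all assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy reduction from splitting `parent` into the `children` label lists."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

def should_stop(parent, children, min_gain=0.05):
    """Hypothetical stopping rule: stop when the split gains less than min_gain bits."""
    return information_gain(parent, children) < min_gain

# Toy customer labels (assumed data, not from the paper).
parent = ["buy", "buy", "skip", "skip"]
clean_split = [["buy", "buy"], ["skip", "skip"]]      # separates the classes
useless_split = [["buy", "skip"], ["buy", "skip"]]    # separates nothing
```

Here `should_stop(parent, useless_split)` is true because the split gains no information, while `clean_split` gains a full bit and the tree would keep growing.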

Investment, Export, and Exchange Rate on Prediction of Employment with Decision Tree, Random Forest, and Gradient Boosting Machine Learning Models (투자와 수출 및 환율의 고용에 대한 의사결정 나무, 랜덤 포레스트와 그래디언트 부스팅 머신러닝 모형 예측)

  • Chae-Deug Yi
    • Korea Trade Review
    • /
    • v.46 no.2
    • /
    • pp.281-299
    • /
    • 2021
  • This paper analyzes the feasibility of using machine learning methods to forecast employment. A decision tree, an artificial neural network, and ensemble models such as random forest and gradient boosting regression trees were used to forecast employment in the Busan regional economy. A comparison of their predictive abilities yielded the following main findings. First, machine learning methods can predict employment well. Second, the employment forecasts of the decision tree models varied somewhat with the depth of the trees. Third, the artificial neural network model did not show high predictive power. Fourth, the ensemble models, random forest and gradient boosting regression trees, showed higher predictive power. Since machine learning methods can predict employment accurately, they should be used to improve the accuracy of employment forecasts.

Decision Tree based Scheduling for Static and Dynamic Flexible Job Shops with Multiple Process Plans (다중 공정계획을 가지는 정적/동적 유연 개별공정에 대한 의사결정 나무 기반 스케줄링)

  • Yu, Jae-Min;Doh, Hyoung-Ho;Kwon, Yong-Ju;Shin, Jeong-Hoon;Kim, Hyung-Won;Nam, Sung-Ho;Lee, Dong-Ho
    • Journal of the Korean Society for Precision Engineering
    • /
    • v.32 no.1
    • /
    • pp.25-37
    • /
    • 2015
  • This paper suggests a decision tree based approach for flexible job shop scheduling with multiple process plans. The problem is to determine the operation/machine pairs and the sequence of the jobs assigned to each machine. Two decision tree based scheduling mechanisms are developed for static and dynamic flexible job shops. In the static case, all jobs are given in advance and the decision tree is used to select a priority dispatching rule to process all the jobs. Also, in the dynamic case, the jobs arrive over time and the decision tree, updated regularly, is used to select a priority rule in real-time according to a rescheduling strategy. The two decision tree based mechanisms were applied to a flexible job shop case with reconfigurable manufacturing cells and a conventional job shop, and the results are reported for various system performance measures.

A study on decision tree creation using intervening variable (매개 변수를 이용한 의사결정나무 생성에 관한 연구)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.4
    • /
    • pp.671-678
    • /
    • 2011
  • Data mining searches for interesting relationships among items in a given database. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. The decision tree approach is most useful in classification problems and divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains such as retail target marketing and customer classification. Depending on the model-building criteria and the number of input variables, the resulting decision tree model can become complicated. In particular, model creation and analysis are difficult when there are many input variables. In this study, we investigate decision trees that use an intervening variable. We apply the approach to real data to propose a method that removes unnecessary input variables from the created model, and we examine its efficiency.

NPC Control Model for Defense in Soccer Game Applying the Decision Tree Learning Algorithm (결정트리 학습 알고리즘을 활용한 축구 게임 수비 NPC 제어 방법)

  • Cho, Dal-Ho;Lee, Yong-Ho;Kim, Jin-Hyung;Park, So-Young;Rhee, Dae-Woong
    • Journal of Korea Game Society
    • /
    • v.11 no.6
    • /
    • pp.61-70
    • /
    • 2011
  • In this paper, we propose a defense NPC control model for a soccer game that applies the Decision Tree learning algorithm. The proposed model extracts the direction patterns and the action patterns generated by many soccer game users, and applies these patterns to the Decision Tree learning algorithm. The model then decides the direction and the action according to the learned Decision Tree. Experimental results show that the proposed model takes some time to learn the Decision Tree, but only 0.001-0.003 milliseconds to decide the direction and the action based on the learned Decision Tree. Therefore, the proposed model can control NPCs in the soccer game system in real time. The proposed model also achieves higher accuracy than a previous model (Letia98) because it can utilize current state information, its analyzed information, and previous state information.

Decision Tree Techniques with Feature Reduction for Network Anomaly Detection (네트워크 비정상 탐지를 위한 속성 축소를 반영한 의사결정나무 기술)

  • Kang, Koohong
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.29 no.4
    • /
    • pp.795-805
    • /
    • 2019
  • Recently, there has been growing interest in network anomaly detection technology to tackle unknown attacks. For this purpose, diverse studies have applied data mining, machine learning, and deep learning to detect network anomalies. In this paper, we evaluate the feasibility of the decision tree, one of the most popular data mining techniques for classification, for network anomaly detection on the NSL-KDD data set. To handle the over-fitting problem of the decision tree, we select 13 features from the original 41 features of the data set using the chi-square test, and then model the decision tree using TensorFlow and Scikit-Learn, yielding binary classification accuracies of 84% and 70% on the KDDTest+ and KDDTest-21 sets of the NSL-KDD test data, respectively. These results are 3 and 6 percentage points higher than the 81% and 64% binary classification accuracies previously reported for decision tree technologies.
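Chi-square feature selection of the kind mentioned above can be sketched as follows: score each categorical feature by its Pearson chi-square statistic against the class label and keep the highest-scoring ones. This is a standard-library illustration only; the function names, the two toy features, and the choice of `k` are assumptions, and the paper's own pipeline (13 of 41 NSL-KDD features) is not reproduced here.

```python
from collections import Counter

def chi_square(feature, labels):
    """Pearson chi-square statistic of a categorical feature against class labels."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    observed = Counter(zip(feature, labels))
    stat = 0.0
    for f, fc in feature_counts.items():
        for l, lc in label_counts.items():
            expected = fc * lc / n  # count expected if feature and label were independent
            stat += (observed[(f, l)] - expected) ** 2 / expected
    return stat

def select_top_k(columns, labels, k):
    """Names of the k features with the largest chi-square statistic."""
    scores = {name: chi_square(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy records (assumed data): one informative feature, one uninformative.
labels = ["attack", "attack", "normal", "normal"]
columns = {
    "flag_error": [1, 1, 0, 0],  # perfectly aligned with the label
    "is_tcp": [1, 0, 1, 0],      # independent of the label
}
selected = select_top_k(columns, labels, k=1)
```

On this toy data `selected` contains only `flag_error`, since `is_tcp` has a chi-square statistic of zero.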

Urban Sprawl prediction in 2030 using decision tree (의사결정나무를 활용한 2030년 도시 확장 예측)

  • Kim, Geun-Han;Choi, Hee-Sun;Kim, Dong-Beom;Jung, Yee-Rim;Jin, Dae-Yong
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • v.23 no.6
    • /
    • pp.125-135
    • /
    • 2020
  • Uncontrolled urban expansion causes various social and economic problems as well as natural and environmental problems. Therefore, it is necessary to forecast urban expansion by identifying the factors related to it. This study aims to forecast urban expansion using a decision tree, a method widely used in various areas. The study used spatial data such as zoning (areas of use), topographic data such as elevation and slope, the environmental conservation value assessment map, and population density data for 2006 and 2018. It extracted the new urban expansion areas by comparing the residential, industrial, and commercial zones in 2006 and 2018, and derived a decision tree using the 2006 data as independent variables. Urban expansion in 2030 is then forecast by applying the 2018 data to the derived decision tree. The analysis confirmed that the distance from the green area, the elevation, the grade on the environmental conservation value assessment map, and the distance from the industrial area were important factors in forecasting urban expansion. In the ROC analysis performed to verify accuracy, the AUC of 0.95051 indicated excellent explanatory power. However, the urban area forecast for 2018 using the decision tree was 15,459.98㎢, which differed significantly from the actual 2018 urban area of 4,144.93㎢. Decision trees, which many regions use to forecast urban expansion, can be useful for identifying which factors affect urban expansion, although they are not suitable for forecasting the expansion of an urban region in detail. Identifying such important factors is expected to provide information that can be used in future land, urban, and environmental planning.
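For readers unfamiliar with the AUC figure quoted above, a minimal sketch of how it can be computed: AUC equals the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The function name and the toy scores are assumptions for illustration and are unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy cells (assumed data): 1 = became urban, 0 = did not.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
auc = roc_auc(labels, scores)
```

Because AUC depends only on how the scores are ranked, a high value does not by itself determine how much total area a thresholded prediction will cover.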

Prediction of the number of public bicycle rental in Seoul using Boosted Decision Tree Regression Algorithm

  • KIM, Hyun-Jun;KIM, Hyun-Ki
    • Korean Journal of Artificial Intelligence
    • /
    • v.10 no.1
    • /
    • pp.9-14
    • /
    • 2022
  • The demand for public bicycles operated by the Seoul Metropolitan Government is increasing every year. The Seoul public bicycle project, which started with about 5,600 units, had grown to 37,500 units as of September 2021, and the number of members is also increasing every year. However, as the project grows, excessive budget spending and deficits are emerging, and the costs of new bicycles, rental stations, and bicycle maintenance are blamed for the deficit. In this paper, Azure Machine Learning Studio and the Boosted Decision Tree Regression technique are used to predict the number of public bicycle rentals from environmental factors and time. The predictions confirmed that demand for public bicycles was high in every season except winter and was highest at 6 p.m. In addition, this paper compares the Boosted Decision Tree Regression algorithm with other regression algorithms to measure algorithm performance. In order of accuracy, the results were: first, Boosted Decision Tree Regression (0.878802); second, Decision Forest Regression (0.838232); third, Poisson Regression (0.62699); and fourth, Linear Regression (0.618773). Based on these predictions, it is expected that more public bicycles will be placed at rental stations near public transportation to meet the growing demand during commuting hours, that more bicycles will be placed at rental stations in summer than in winter, and that the service life of bicycles can be extended in winter.
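The study used the Boosted Decision Tree Regression module in Azure Machine Learning Studio, whose internals are not described in the abstract. As a rough illustration of the underlying idea only, the sketch below fits one-split trees (stumps) to the current residuals and adds them up with a learning rate. All names, hyperparameters, and the hour/rental numbers are assumptions, not values from the paper.

```python
def fit_stump(x, residuals):
    """Best single-threshold split of x for the residuals (needs >= 2 distinct x values)."""
    best = None
    for t in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    return best[1:]  # (threshold, left leaf value, right leaf value)

def fit_boosted(x, y, n_rounds=50, learning_rate=0.5):
    """Boosting for squared error: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((t, left_mean, right_mean))
        pred = [pi + learning_rate * (left_mean if xi <= t else right_mean)
                for xi, pi in zip(x, pred)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}

def predict(model, xi):
    """Sum the base value and the scaled contribution of every stump."""
    total = model["base"]
    for t, left_mean, right_mean in model["stumps"]:
        total += model["learning_rate"] * (left_mean if xi <= t else right_mean)
    return total

# Toy data (assumed): hour of day versus number of rentals.
hours = [0, 6, 12, 18]
rentals = [10, 40, 60, 120]
model = fit_boosted(hours, rentals)
```

Calling `predict(model, 18)` returns the fitted rental count for 6 p.m.; a production model would use many features (weather, season, hour) and deeper trees.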

Adopting and Implementation of Decision Tree Classification Method for Image Interpolation (이미지 보간을 위한 의사결정나무 분류 기법의 적용 및 구현)

  • Kim, Donghyung
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.16 no.1
    • /
    • pp.55-65
    • /
    • 2020
  • With the development of display hardware, image interpolation techniques have been used in various fields such as image zooming and medical imaging. Traditional image interpolation methods, such as bi-linear interpolation, bi-cubic interpolation, and edge direction-based interpolation, perform interpolation in the spatial domain. Recently, interpolation techniques in the discrete cosine transform or wavelet domain have also been proposed. Using these existing interpolation methods and machine learning, we propose decision tree classification-based image interpolation methods. In other words, this paper concerns a method of adaptively applying various existing interpolation methods, not an interpolation method itself. To obtain the decision model, we used Weka's J48 library with the C4.5 decision tree algorithm. The proposed method first constructs an attribute set and selects the classes, which correspond to interpolation methods, for the classification model. After training, interpolation is performed using different interpolation methods according to the attribute characteristics. Simulation results show that the proposed method yields reasonable performance.