• Title/Abstract/Keywords: decision tree analysis (의사결정나무분석)

Analysis of Cultural Asset Information Using Statistical Classification Methods (통계적 분류방법을 이용한 문화재 정보 분석)

  • Kang, Min-Gu;Sung, Su-Jin;Lee, Jin-Young;Na, Jong-Hwa
    • Proceedings of the Korea Society for Industrial Systems Conference / pp.120-125 / 2009
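  • In this paper, we analyze cultural asset data using statistical classification methods. The classification methods used are linear discriminant analysis, logistic regression, decision tree analysis, neural network analysis, and SVM analysis. We briefly introduce the concepts and theory behind each method. In the real-data analysis, we build a fitted model for each classification method for buried cultural assets, based on the Iksan City data from among the data used in "Statistical Analysis and Model Development for Regional Cultural Assets, Phase 1 (2008)" (지역별 문화재 통계분석 및 모형개발 연구 1차(2008)). We then compare the performance of the fitted models using the constructed models and the results of simulation experiments. The analysis tool is R-project, which has recently attracted the most attention.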

A study on the comparison of descriptive variables reduction methods in decision tree induction: A case of prediction models of pension insurance in life insurance company (생명보험사의 개인연금 보험예측 사례를 통해서 본 의사결정나무 분석의 설명변수 축소에 관한 비교 연구)

  • Lee, Yong-Goo;Hur, Joon
    • Journal of the Korean Data and Information Science Society / v.20 no.1 / pp.179-190 / 2009
  • In the financial industry, the decision tree algorithm has been widely used for classification analysis. One of the major difficulties in this setting is that there are many explanatory variables to consider for modeling. We therefore need an effective method for reducing the number of explanatory variables without seriously affecting the modeling results. In this research, we compare various variable reduction methods and identify the best one based on the modeling accuracy of the tree algorithm. We applied the methods to the pension insurance data of an insurance company to obtain empirical results. We found that selecting variables by the sensitivity analysis of a neural network is the most effective method for reducing the number of variables while maintaining accuracy.

Estimating the determinants of victory and defeat through analyzing records of Korean pro-basketball (한국남자프로농구 경기기록 분석을 통한 승패결정요인 추정: 2010-2011시즌, 2011-2012시즌 정규리그 기록 적용)

  • Kim, Sae-Hyung;Lee, Jun-Woo;Lee, Mi-Sook
    • Journal of the Korean Data and Information Science Society / v.23 no.5 / pp.993-1003 / 2012
  • The purpose of this study was to estimate the determinants of victory and defeat by analyzing records of Korean men's professional basketball. Statistical models of victory and defeat were established from current game records (2010-2011 and 2011-2012 seasons). The Korea Basketball League (KBL) publishes records for every professional basketball game. Six offense variables (2P%, 3P%, FT%, OR, AS, TO) and four defense variables (DR, ST, GD, BS) were used in this study. The PASW program was used for logistic regression and the Answer Tree program for the decision tree. All significance levels were set at .05. The major results were as follows. In the logistic regression, 2P%, 3P%, and TO were the three offense variables significantly affecting victory and defeat, and DR, ST, and BS were the three significant defense variables. The offense variables 2P%, 3P%, TO, and AS were used in constructing the decision tree. The highest percentage of victory was 80.85%, when 2P% was 51%-58%, 3P% was more than 31 percent, and TO was less than 11 times. In the decision tree of the defense variables, the highest percentage of victory was 94.12%, when DR was more than 24, ST was more than six, and BS was more than two.
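
A decision-tree leaf such as the ones reported in this abstract can be read as a conjunction of threshold conditions. The following Python sketch illustrates that reading only; the function names are hypothetical, and whether the 51%-58% range includes its endpoints is an assumption, since the abstract does not say.

```python
def in_best_offense_leaf(two_pt_pct, three_pt_pct, turnovers):
    """True if a game falls in the offense leaf reported with 80.85% wins.

    Assumption: the 51%-58% range for 2P% is treated as inclusive.
    """
    return 51 <= two_pt_pct <= 58 and three_pt_pct > 31 and turnovers < 11


def in_best_defense_leaf(def_rebounds, steals, blocks):
    """True if a game falls in the defense leaf reported with 94.12% wins."""
    return def_rebounds > 24 and steals > 6 and blocks > 2


# Hypothetical game statistics, not taken from the study
print(in_best_offense_leaf(two_pt_pct=55, three_pt_pct=35, turnovers=9))
print(in_best_defense_leaf(def_rebounds=26, steals=5, blocks=3))
```

The reported percentages describe the share of wins among games in each leaf, so membership in a leaf indicates a group-level win rate rather than a prediction for a single game.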

Classification Analysis for the Prediction of Underground Cultural Assets (매장문화재 예측을 위한 통계적 분류 분석)

  • Yu, Hye-Kyung;Lee, Jin-Young;Na, Jong-Hwa
    • Journal of the Korea Industrial Information Systems Research / v.14 no.3 / pp.106-113 / 2009
  • Various statistical classification methods have been used to establish prediction models of underground cultural assets in our country. Among them, linear discriminant analysis, logistic regression, decision tree, neural network, and support vector machines are used in this paper. We introduce the basic concepts of these classification methods and apply them to the analysis of real data from I city. As a result, five different prediction models are suggested. The models are compared using the correct classification rates of the fitted models. To assess the applicability of the suggested models to a new data set, simulations are carried out. R packages and programs are used in the real-data analyses and simulations. In particular, the detailed R procedures are provided for other analysts in related areas.

Comparative Analysis of Predictors of Depression for Residents in a Metropolitan City using Logistic Regression and Decision Making Tree (로지스틱 회귀분석과 의사결정나무 분석을 이용한 일 대도시 주민의 우울 예측요인 비교 연구)

  • Kim, Soo-Jin;Kim, Bo-Young
    • The Journal of the Korea Contents Association / v.13 no.12 / pp.829-839 / 2013
  • This study is a descriptive study that aims to predict and compare factors of depression among residents of a metropolitan city using logistic regression analysis and decision-making tree analysis. The subjects were 462 residents ($20 \leq \text{age} < 65$) of a metropolitan city. Data were collected between October 7, 2011 and October 21, 2011 and analyzed with frequency analysis, percentage, mean and standard deviation, $\chi^2$-test, t-test, logistic regression analysis, ROC curve, and a decision-making tree using the SPSS 18.0 program. The common predictors of depression in community residents were social dysfunction, perceived physical symptoms, and family support. The specificity and sensitivity of the logistic regression were 93.8% and 42.5%, respectively. The receiver operating characteristic (ROC) curve was used to determine an optimal model. The AUC (area under the curve) was .84, and the ROC curve was statistically significant (p<.001). The specificity and sensitivity of the decision-making tree analysis were 98.3% and 20.8%, respectively. In overall classification accuracy, the logistic regression achieved 82.0% and the decision-making tree analysis 80.5%. These results suggest that the logistic regression analysis, which showed higher sensitivity and classification accuracy, may be useful for establishing a depression prediction model for community residents.
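
The sensitivity, specificity, and overall classification accuracy compared in this abstract are all derived from a model's confusion matrix. The sketch below shows the standard definitions in Python; the function name and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: depressed cases correctly flagged     fn: depressed cases missed
    tn: non-depressed correctly cleared       fp: non-depressed wrongly flagged
    """
    sensitivity = tp / (tp + fn)          # share of true positives detected
    specificity = tn / (tn + fp)          # share of true negatives detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only
sens, spec, acc = classification_metrics(tp=40, fn=60, tn=180, fp=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

When the negative class is much larger than the positive class, a model can reach high accuracy and specificity while missing most positive cases, which is why sensitivity is reported separately.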

A Study on Factors of Internet Overdependence for Adults Using the Decision Tree Analysis Model (성인층의 인터넷 과의존 영향요인: 의사결정나무분석을 활용하여)

  • Seo, Hyung-Jun;Shin, Ji-Woong
    • Informatization Policy / v.25 no.2 / pp.20-45 / 2018
  • This study aims to find the factors of Internet overdependence in adults through the decision tree analysis model, a data mining method, using the National Information Society Agency's raw data from the 2016 survey on Internet overdependence. The decision tree analysis identified a total of 16 nodes of Internet overdependence risk groups. The main predictor variables, in descending order of impact on the risk groups, were the amount of time spent per smart media use on weekdays; the amount of time spent per smart media use on weekends; experience of purchasing cash items; percentage of smart media use for leisure; negative personality; percentage of smart media use for information search and utilization; and awareness of the good functions of the Internet. Users in the highest risk node used smart media for more than 5 minutes per use and less than 5~10 minutes on weekdays, had experience of purchasing cash items, and had a lower level of awareness of the good functions of the Internet. The analysis led to the following recommendations. First, even short-time use has a higher chance of causing Internet overdependence, so guidelines need to be developed based on research on usage behavior rather than usage time. Second, self-regulation is required because factors that affect overindulgence in games, such as cash items, increase Internet overdependence. Third, using the Internet for leisure carries a higher risk of overdependence, so other means of leisure should be recommended.

A verification of algorithm on resilience leisure programs for the productive aging of the new elderly in Korea (한국 신노년층의 생산적 노화를 위한 회복탄력형 여가 프로그램 알고리즘 검증)

  • Yi, Eun Surk;Hwang, Hee Jeong;Shim, Seung Koo;Cho, Gun Sang;Ahn, Chan Woo
    • Journal of Digital Convergence / v.15 no.5 / pp.505-515 / 2017
  • This study verifies an algorithm for resilience leisure programs for the productive aging of the new elderly in Korea. The subjects were 525 new elderly people living in metropolises, medium-sized cities, and farming areas. The reliability and validity of the questionnaire were tested using the SPSS 20.0 program. The results of the tree analysis are as follows. First, the influential factors in the resilience leisure programs are subjective health status, desire for activity, interpersonal exchange, and household income. The most influential resilience factors of the algorithm are interpersonal relationship, self-regulating, and affirmative. In the structural algorithm of resilience, the low interpersonal relationship group was related to the affirmative factor and the high interpersonal relationship group to the self-regulating factor.

Exploring predictors of subsequent childbirth plan for non-employed and employed mothers : The application of decision tree analysis (의사결정나무분석을 적용한 비취업모와 취업모의 후속출산계획 예측요인 탐색)

  • Lim, Yang-Mi
    • Journal of Korean Home Economics Education Association / v.27 no.4 / pp.155-172 / 2015
  • This study aimed to identify the effects of mothers' variables and present children's variables on subsequent childbirth plans and to explore predictors of subsequent childbirth plans for non-employed and employed mothers. The subjects were 1,635 mothers who participated in the Panel Study on Korean Children from 2008 to 2010 and had no subsequent children by 2010 after giving birth in 2008. The data were analyzed with descriptive statistics, t test, $\chi^2$ test, and decision tree analysis. The main results were as follows. First, mothers' child-rearing stress, child value, marital satisfaction, social support, and present children's birth order and sex influenced mothers' subsequent childbirth plans, whereas average monthly family income did not. Second, for non-employed mothers, present children's birth order and sex and mothers' child value predicted their subsequent childbirth plans. Specifically, mothers whose present child was first-born and female were most likely to have a subsequent childbirth plan, followed by mothers whose present child was first-born and male and whose child value was higher. Third, for employed mothers, present children's birth order and mothers' marital satisfaction predicted their subsequent childbirth plans. Specifically, mothers whose present child was first-born and whose marital satisfaction was higher were most likely to have a subsequent childbirth plan. Finally, the study suggested a role for Home Economics Education in raising the rate of subsequent childbirth.

A Study on Decision Factors Affecting Utilization of Elderly Welfare Center: Focus on Gimpo City (노인복지관 이용 결정요인에 관한 연구: 김포시 노인을 중심으로)

  • Won, Il;Kim, Keunhong;Kim, SungHyun
    • 한국노년학 (Journal of the Korea Gerontological Society) / v.38 no.2 / pp.351-364 / 2018
  • The purpose of this study is to identify the decision factors affecting utilization of elderly welfare centers among the elderly living in Gimpo city. The motivation is that elderly welfare centers, as providers of general welfare services, need to consider not only state policy but also their inherent role and function for the elderly. Especially for elders living in rural areas, although the number of elderly welfare centers nationwide has greatly increased in the last 10 years, the effect and function of the facilities remain almost the same and leisure activities are still lacking. This has become a serious problem. For these reasons, this article conducts a social survey of 360 elderly people over the age of 65 living in Gimpo city, a rural-urban type city. The research method examines the relationships among the predisposing, enabling, and need factors of Andersen's behavior model using binary logistic regression analysis and decision tree analysis. The binary logistic regression results show that most factors of Andersen's model are significant: age, gender, and education level among the predisposing factors; monthly income among the enabling factors; and the reserve for old life and the preparation of economic activity for old life among the need factors. The decision tree analysis shows the interaction between factors: when the education level among the predisposing factors is higher, the possibility of using an elderly welfare center becomes greater. Also, as the level of health-promoting preparation among the need factors gets lower, the possibility of using an elderly welfare center still becomes greater. Although differences were found in the interpretation of the results of the regression analysis and the decision tree analysis, the results of this study still support the need for elderly welfare centers to provide integrated welfare services.