• Title/Summary/Keyword: Predictive models


Ensemble approach for improving prediction in kernel regression and classification

  • Han, Sunwoo;Hwang, Seongyun;Lee, Seokho
    • Communications for Statistical Applications and Methods / v.23 no.4 / pp.355-362 / 2016
  • Ensemble methods often improve prediction in various predictive models by combining multiple weak learners and reducing the variability of the final predictive model. In this work, we demonstrate that ensemble methods also enhance the accuracy of prediction under kernel ridge regression and kernel logistic regression classification. We apply bagging and random forests to these two kernel-based predictive models and present the procedure for embedding bagging and random forests in kernel-based predictive models. Our proposals are tested on numerous synthetic and real datasets and compared with plain kernel-based predictive models and their subsampling approach. Numerical studies demonstrate that the ensemble approach outperforms plain kernel-based predictive models.
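
As an illustration of the idea in this abstract, the following is a minimal sketch of bagging applied to kernel ridge regression. It is not the authors' procedure: the Gaussian kernel, the one-dimensional inputs, the parameter values, and all function names (`rbf`, `solve`, `fit_krr`, `fit_bagged_krr`) are assumptions made for the example, and the random-forest variant is not shown.

```python
import math
import random

def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two scalars (kernel choice is an assumption)."""
    return math.exp(-gamma * (a - b) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_krr(xs, ys, lam=0.1, gamma=1.0):
    """Kernel ridge regression: solve (K + lam * I) alpha = y, return a predictor."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j], gamma) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(K, ys)
    return lambda x: sum(a * rbf(x, xi, gamma) for a, xi in zip(alpha, xs))

def fit_bagged_krr(xs, ys, n_models=25, seed=0, lam=0.1, gamma=1.0):
    """Bagging: fit one KRR model per bootstrap resample, average the predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_krr([xs[i] for i in idx], [ys[i] for i in idx], lam, gamma))
    return lambda x: sum(m(x) for m in models) / len(models)
```

Calling `fit_bagged_krr(xs, ys)` returns a function that predicts at a new point; averaging over resamples is what reduces the variance of the final predictor.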

Empirical Analysis of 3 Statistical Models of Hospital Bankruptcy in Korea (병원도산 예측모형의 실증적 비교연구)

  • 이무식;서영준;양동현
    • Health Policy and Management / v.9 no.2 / pp.1-20 / 1999
  • This study was conducted to investigate the predictors of hospital bankruptcy in Korea and to examine the predictive power of 3 types of statistical models of hospital bankruptcy. Data on 17 financial and 4 non-financial indicators of 30 bankrupt and 30 profitable hospitals for 1, 2, and 3 years before bankruptcy were obtained from the hospital performance databank of the Korea Institute of Health Services Management. Significant variables were identified through mean comparison of each indicator between bankrupt and profitable hospitals, and the predictive power of the statistical models of hospital bankruptcy was compared. The major findings are as follows. 1. Nine out of 21 indicators (fixed ratio, quick ratio, operating profit to total assets, operating profit to gross revenue, normal profit to total assets, normal profit to gross revenue, net profit to gross revenue, inventory turnover, and added value per adjusted patient) were found to be significant predictive variables in the Logit and Probit models. 2. The predictive power of the discriminant model of hospital bankruptcy for 1, 2, and 3 years before bankruptcy was 85.4, 79.0, and 83.8% respectively. The predictive power of the Logit model was 82.3, 75.8, and 80.6% respectively, and that of the Probit model was 87.1, 80.6, and 88.7% respectively. 3. The predictive power of the Probit model of hospital bankruptcy is better than that of the other two predictive models.


Nonlinear Models and Linear Models in Expert-Modeling A Lens Model Analysis (전문가 모델링에서 비선형모형과 선형모형 : 렌즈모형분석)

  • 김충녕
    • Journal of Intelligence and Information Systems / v.1 no.2 / pp.1-16 / 1995
  • The field of human judgment and decision making provides useful methodologies for examining the human decision making process, as well as substantive results. One of these methodologies is lens model analysis, which can detect valid nonlinearity in human decision behavior. Two linear (statistical) models of human experts and two nonlinear models of human experts are compared in terms of predictive accuracy (predictive validity). The results indicate that nonlinear models can capture factors (valid nonlinearity) that contribute to the expert's predictive accuracy, but not factors (inconsistency) that detract from their predictive accuracy. It is then argued that nonlinear models can be more accurate than linear models, or as accurate as human experts, especially when human experts employ valid nonlinear strategies in decision making.


Water consumption prediction based on machine learning methods and public data

  • Kesornsit, Witwisit;Sirisathitkul, Yaowarat
    • Advances in Computational Design / v.7 no.2 / pp.113-128 / 2022
  • Water consumption is strongly affected by numerous factors, such as population, climatic, geographic, and socio-economic factors. Therefore, implementing a reliable predictive model of water consumption patterns is a challenging task. This study investigates the performance of predictive models based on multi-layer perceptron (MLP), multiple linear regression (MLR), and support vector regression (SVR). To understand the significant factors affecting water consumption, the stepwise regression (SW) procedure is used in MLR to obtain suitable variables. This study then implements three predictive models based on these significant variables (SWMLR, SWMLP, and SWSVR). Annual data on water consumption in Thailand during 2006-2015 were compiled and categorized by province and distributor. Comparing the predictive performance of the models with all variables, the results demonstrate that the MLP models outperformed the MLR and SVR models. Among the models with selected variables, the predictive capability of SWMLP was superior to that of SWMLR and SWSVR. The SWMLP thus still provided satisfactory results with the minimum number of explanatory variables, which in turn reduced the computation time and other resources required for the predictive task. It can be concluded that the MLP exhibited the best results and can be utilized as a reliable water demand predictive model in both the all-variables and selected-variables cases. These findings have important implications: the approach serves as a feasible water consumption predictive model and can be used in water resources management to produce sufficient tap water to meet the demand in each province of Thailand.

PREDICTING KOREAN FRUIT PRICES USING LSTM ALGORITHM

  • PARK, TAE-SU;KEUM, JONGHAE;KIM, HOISUB;KIM, YOUNG ROCK;MIN, YOUNGHO
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.26 no.1 / pp.23-48 / 2022
  • In this paper, we provide predictive models for the market price of fruits and analyze the performance of each fruit price predictive model. The data used to create the predictive models are fruit price data, weather data, and Korea composite stock price index (KOSPI) data. We collected these data through Open-API for the 10-year period from 2011 to 2020. Six types of fruit price predictive models are constructed using the LSTM algorithm, a special form of the deep learning RNN algorithm, and their performance is measured using the root mean square error. For each model, the data from 2011 to 2018 are used for training to predict the fruit price in 2019, and the data from 2011 to 2019 are used for training to predict the fruit price in 2020. By comparing the fruit price predictive models for 2019 with those for 2020, the model with excellent efficiency is identified and the best model to provide the service is selected. The model we made will be available in other countries and regions as well.
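
The evaluation scheme described in this abstract (train on all earlier years, predict the next year, score with root mean square error) can be sketched as follows. This is only an outline under stated assumptions: the helper names `rmse` and `expanding_window_splits` are hypothetical, and the LSTM model itself is not shown.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def expanding_window_splits(years, first_test_year):
    """Yield (train_years, test_year): each test year is predicted from all earlier years."""
    years = sorted(years)
    for test_year in years:
        if test_year >= first_test_year:
            yield [y for y in years if y < test_year], test_year

# Reproduces the two splits named in the abstract:
# train on 2011-2018 to predict 2019, then train on 2011-2019 to predict 2020.
for train_years, test_year in expanding_window_splits(range(2011, 2021), 2019):
    print(train_years[0], "-", train_years[-1], "->", test_year)
```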

Evaluating Predictive Ability of Classification Models with Ordered Multiple Categories

  • Oong-Hyun Sung
    • Communications for Statistical Applications and Methods / v.6 no.2 / pp.383-395 / 1999
  • This study is concerned with the evaluation of the predictive ability of classification models with ordered multiple categories. If categories can be ordered or ranked, the spread of misclassification should be considered when evaluating the performance of the classification models; this is done using the loss rate, since the apparent error rate cannot measure the spread of misclassification. Since the loss rate is known to underestimate the true loss rate, the bootstrap method was used to estimate the true loss rate. Thus, this study suggests a method to evaluate the predictive power of classification models using the loss rate and the bootstrap estimate of the true loss rate.
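
To make the distinction in this abstract concrete, here is a small sketch comparing the apparent error rate with a loss rate for ordered categories. The loss used here (absolute distance between category ranks, scaled to [0, 1]) is an assumption for illustration; the abstract does not state the paper's exact loss function, and the bootstrap correction is not shown.

```python
def apparent_error_rate(true, pred):
    """Fraction of cases that are misclassified, ignoring how far off they are."""
    return sum(t != p for t, p in zip(true, pred)) / len(true)

def loss_rate(true, pred, n_categories):
    """Mean absolute rank distance, divided by the largest possible distance.

    Assumes categories are coded 0 .. n_categories - 1 in rank order.
    """
    total = sum(abs(t - p) for t, p in zip(true, pred))
    return total / (len(true) * (n_categories - 1))

true = [0, 0, 3, 3]
near = [1, 0, 2, 3]   # two errors, each off by one category
far = [3, 0, 0, 3]    # two errors, each off by three categories

# Both classifiers have the same apparent error rate ...
print(apparent_error_rate(true, near), apparent_error_rate(true, far))
# ... but the loss rate separates near misses from far misses.
print(loss_rate(true, near, 4), loss_rate(true, far, 4))
```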


Scalable Prediction Models for Airbnb Listing in Spark Big Data Cluster using GPU-accelerated RAPIDS

  • Muralidharan, Samyuktha;Yadav, Savita;Huh, Jungwoo;Lee, Sanghoon;Woo, Jongwook
    • Journal of information and communication convergence engineering / v.20 no.2 / pp.96-102 / 2022
  • We aim to build predictive models for Airbnb's prices using GPU-accelerated RAPIDS in a big data cluster. The Airbnb Listings datasets are used for the predictive analysis. Several machine-learning algorithms have been adopted to build models that predict the price of Airbnb listings. We compare the results of traditional and big data approaches to machine learning for price prediction and discuss the performance of the models. We built big data models using Databricks Spark Cluster, a distributed parallel computing system. Furthermore, we implemented models on multiple GPUs using RAPIDS in the Spark cluster. The model was developed using the XGBoost algorithm, whereas other models were developed using traditional central processing unit (CPU)-based algorithms. This study compared all models in terms of accuracy metrics and computing time. We observed that the XGBoost model with RAPIDS using GPUs had the highest accuracy and computing time.

Development and Evaluation of Electronic Health Record Data-Driven Predictive Models for Pressure Ulcers (전자건강기록 데이터 기반 욕창 발생 예측모델의 개발 및 평가)

  • Park, Seul Ki;Park, Hyeoun-Ae;Hwang, Hee
    • Journal of Korean Academy of Nursing / v.49 no.5 / pp.575-585 / 2019
  • Purpose: The purpose of this study was to develop predictive models for pressure ulcer incidence using electronic health record (EHR) data and to compare their predictive validity performance indicators with those of the Braden Scale used in the study hospital. Methods: A retrospective case-control study was conducted in a tertiary teaching hospital in Korea. Data of 202 pressure ulcer patients and 14,705 non-pressure ulcer patients admitted between January 2015 and May 2016 were extracted from the EHRs. Three predictive models for pressure ulcer incidence were developed using logistic regression, Cox proportional hazards regression, and decision tree modeling. The predictive validity performance indicators of the three models were compared with those of the Braden Scale. Results: The logistic regression model was most efficient, with a high area under the receiver operating characteristic curve (AUC) estimate of 0.97, followed by the decision tree model (AUC 0.95), the Cox proportional hazards regression model (AUC 0.95), and the Braden Scale (AUC 0.82). Decreased mobility was the most significant factor in the logistic regression and Cox proportional hazards models, and the endotracheal tube was the most important factor in the decision tree model. Conclusion: Predictive validity performance indicators of the Braden Scale were lower than those of the logistic regression, Cox proportional hazards regression, and decision tree models. The models developed in this study can be used to develop a clinical decision support system that automatically assesses risk for pressure ulcers to aid nurses.
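
The AUC values compared in this abstract can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case. The sketch below computes AUC directly from that definition; the function name `auc` and the toy data are assumptions for illustration and have no connection to the study's data.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 (case) or 0 (non-case).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 3 of the 4 (case, non-case) pairs are ordered correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This pairwise form is quadratic in the number of records, so for a dataset the size of the one in the study a rank-based implementation would be preferable; the pairwise version is shown because it mirrors the definition.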

Investment, Export, and Exchange Rate on Prediction of Employment with Decision Tree, Random Forest, and Gradient Boosting Machine Learning Models (투자와 수출 및 환율의 고용에 대한 의사결정 나무, 랜덤 포레스트와 그래디언트 부스팅 머신러닝 모형 예측)

  • Chae-Deug Yi
    • Korea Trade Review / v.46 no.2 / pp.281-299 / 2021
  • This paper analyzes the feasibility of using machine learning methods to forecast employment. Machine learning methods such as decision tree, artificial neural network, and ensemble models such as random forest and gradient boosting regression tree were used to forecast employment in the Busan regional economy. The main findings from the comparison of their predictive abilities are as follows. First, machine learning methods can predict employment well. Second, the forecast values for employment from decision tree models differed somewhat according to the depth of the decision trees. Third, the artificial neural network model, however, does not show high predictive power. Fourth, the ensemble models such as random forest and gradient boosting regression tree show higher predictive power. Thus, since machine learning methods can accurately predict employment, we need to improve the accuracy of employment forecasting through the use of machine learning methods.

Application of THM Predictive Model in Water Distribution System (국내 상수관로에 대한 THM 발생 예측모델의 적용)

  • Lee, Doo-Jin;Kim, Young-Il;Sohn, Jin-Sik
    • Journal of Korean Society of Water and Wastewater / v.21 no.1 / pp.3-11 / 2007
  • THM models have been developed by several researchers in order to better understand and manage the presence of THM in water distribution systems. Several previously developed models were applied in this study to estimate THM concentrations in a target water distribution system. In order to investigate the performance of these THM models, lab and field tests were conducted. The THM concentrations predicted by all of the models showed good correlation with observed values. When the models were compared against the lab and field tests, the Rodriguez model was the most predictive of the models tested.