• Title/Summary/Keyword: logistic regression models

Search Result 617, Processing Time 0.032 seconds

Multiple Deletions in Logistic Regression Models

  • Jung, Kang-Mo
    • Communications for Statistical Applications and Methods
    • /
    • v.16 no.2
    • /
    • pp.309-315
    • /
    • 2009
  • We extended the results of Roy and Guria (2008) to multiple deletions in logistic regression models. Since single deletions may not exactly detect outliers or influential observations due to swamping effects and masking effects, it needs multiple deletions. We developed conditional deletion diagnostics which are designed to overcome problems of masking effects. We derived the closed forms for several statistics in logistic regression models. They give useful diagnostics on the statistics.

Prediction on Busan's Gross Product and Employment of Major Industry with Logistic Regression and Machine Learning Model (로지스틱 회귀모형과 머신러닝 모형을 활용한 주요산업의 부산 지역총생산 및 고용 효과 예측)

  • Chae-Deug Yi
    • Korea Trade Review
    • /
    • v.47 no.2
    • /
    • pp.69-88
    • /
    • 2022
  • This paper aims to predict Busan's regional product and employment using the logistic regression models and machine learning models. The following are the main findings of the empirical analysis. First, the OLS regression model shows that the main industries such as electricity and electronics, machine and transport, and finance and insurance affect the Busan's income positively. Second, the binomial logistic regression models show that the Busan's strategic industries such as the future transport machinery, life-care, and smart marine industries contribute on the Busan's income in large order. Third, the multinomial logistic regression models show that the Korea's main industries such as the precise machinery, transport equipment, and machinery influence the Busan's economy positively. And Korea's exports and the depreciation can affect Busan's economy more positively at the higher employment level. Fourth, the voting ensemble model show the higher predictive power than artificial neural network model and support vector machine models. Furthermore, the gradient boosting model and the random forest show the higher predictive power than the voting model in large order.

Penalized logistic regression models for determining the discharge of dyspnea patients (호흡곤란 환자 퇴원 결정을 위한 벌점 로지스틱 회귀모형)

  • Park, Cheolyong;Kye, Myo Jin
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.1
    • /
    • pp.125-133
    • /
    • 2013
  • In this paper, penalized binary logistic regression models are employed as statistical models for determining the discharge of 668 patients with a chief complaint of dyspnea based on 11 blood tests results. Specifically, the ridge model based on $L^2$ penalty and the Lasso model based on $L^1$ penalty are considered in this paper. In the comparison of prediction accuracy, our models are compared with the logistic regression models with all 11 explanatory variables and the selected variables by variable selection method. The results show that the prediction accuracy of the ridge logistic regression model is the best among 4 models based on 10-fold cross-validation.

Collapsibility and Suppression for Cumulative Logistic Model

  • Hong, Chong-Sun;Kim, Kil-Tae
    • Communications for Statistical Applications and Methods
    • /
    • v.12 no.2
    • /
    • pp.313-322
    • /
    • 2005
  • In this paper, we discuss suppression for logistic regression model. Suppression for linear regression model was defined as the relationship among sums of squared for regression as well as correlation coefficients of. variables. Since it is not common to obtain simple correlation coefficient for binary response variable of logistic model, we consider cumulative logistic models with multinomial and ordinal response variables rather than usual logistic model. As number of category of a response variable for the cumulative logistic model gets collapsed into binary, it is found that suppressions for these logistic models are changed. These suppression results for cumulative logistic models are discussed and compared with those of linear model.

Comparative Study on Statistical Packages for Analyzing Logistic Regression - MINITAB, SAS, SPSS, STATA -

  • Kim, Soon-Kwi;Jeong, Dong-Bin;Park, Young-Sool
    • Journal of the Korean Data and Information Science Society
    • /
    • v.15 no.2
    • /
    • pp.367-378
    • /
    • 2004
  • Recently logistic regression is popular in a variety of fields so that a number of statistical packages are developed for analyzing the logistic regression. This paper briefly considers the several types of logistic regression models used depending on different types of data. In addition, when four statistical packages (MINTAB, SAS, SPSS and STATA) are used to apply logistic regression models to the real fields respectively, their scope and characteristics are investigated.

  • PDF

Comparison of Regression Models for Estimating Ventilation Rate of Mechanically Ventilated Swine Farm (강제환기식 돈사의 환기량 추정을 위한 회귀모델의 비교)

  • Jo, Gwanggon;Ha, Taehwan;Yoon, Sanghoo;Jang, Yuna;Jung, Minwoong
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.62 no.1
    • /
    • pp.61-70
    • /
    • 2020
  • To estimate the ventilation volume of mechanically ventilated swine farms, various regression models were applied, and errors were compared to select the regression model that can best simulate actual data. Linear regression, linear spline, polynomial regression (degrees 2 and 3), logistic curve, generalized additive model (GAM), and gompertz curve were compared. Overfitting models were excluded even when the error rate was small. The evaluation criteria were root mean square error (RMSE) and mean absolute percentage error (MAPE). The evaluation results indicated that degree 3 exhibited the lowest error rate; however, an overestimation contradiction was observed in a certain section. The logistic curve was the most stable and superior to all the models. In the estimation of ventilation volume by all of the models, the estimated ventilation volume of the logistic curve was the smallest except for the model with a large error rate and the overestimated model.

Estimating small area proportions with kernel logistic regressions models

  • Shim, Jooyong;Hwang, Changha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.4
    • /
    • pp.941-949
    • /
    • 2014
  • Unit level logistic regression model with mixed effects has been used for estimating small area proportions, which treats the spatial effects as random effects and assumes linearity between the logistic link and the covariates. However, when the functional form of the relationship between the logistic link and the covariates is not linear, it may lead to biased estimators of the small area proportions. In this paper, we relax the linearity assumption and propose two types of kernel-based logistic regression models for estimating small area proportions. We also demonstrate the efficiency of our propose models using simulated data and real data.

Suppression for Logistic Regression Model (로지스틱 회귀모형에서의 SUPPRESSION)

  • Hong C. S.;Kim H. I.;Ham J. H.
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.3
    • /
    • pp.701-712
    • /
    • 2005
  • The suppression for logistic regression models has been debated no longer than that for linear regression models since, among many other reasons, sum of squares for regression (SSR) or coefficient of determination ($R^2$) could be defined into various ways. Based on four kinds of $R^2$'s: two kinds are most preferred, and the other two are proposed by Liao & McGee (2003), four kinds of SSR's are derived so that the suppression for logistic models is explained. Many data fitted to logistic models are generated by Monte Carlo method. We explore when suppression happens, and compare with that for linear regression models.

Comparison of Classification Models for Sequential Flight Test Results (단계별 비행훈련 성패 예측 모형의 성능 비교 연구)

  • Sohn, So-Young;Cho, Yong-Kwan;Choi, Sung-Ok;Kim, Young-Joun
    • Journal of the Ergonomics Society of Korea
    • /
    • v.21 no.1
    • /
    • pp.1-14
    • /
    • 2002
  • The main purpose of this paper is to present selection criteria for ROK Airforce pilot training candidates in order to save costs involved in sequential pilot training. We use classification models such Decision Tree, Logistic Regression and Neural Network based on aptitude test results of 288 ROK Air Force applicants in 1994-1996. Different models are compared in terms of classification accuracy, ROC and Lift-value. Neural network is evaluated as the best model for each sequential flight test result while Logistic regression model outperforms the rest of them for discriminating the last flight test result. Therefore we suggest a pilot selection criterion based on this logistic regression. Overall. we find that the factors such as Attention Sharing, Speed Tracking, Machine Comprehension and Instrument Reading Ability having significant effects on the flight results. We expect that the use of our criteria can increase the effectiveness of flight resources.

Study on Accident Prediction Models in Urban Railway Casualty Accidents Using Logistic Regression Analysis Model (로지스틱회귀분석 모델을 활용한 도시철도 사상사고 사고예측모형 개발에 대한 연구)

  • Jin, Soo-Bong;Lee, Jong-Woo
    • Journal of the Korean Society for Railway
    • /
    • v.20 no.4
    • /
    • pp.482-490
    • /
    • 2017
  • This study is a railway accident investigation statistic study with the purpose of prediction and classification of accident severity. Linear regression models have some difficulties in classifying accident severity, but a logistic regression model can be used to overcome the weaknesses of linear regression models. The logistic regression model is applied to escalator (E/S) accidents in all stations on 5~8 lines of the Seoul Metro, using data mining techniques such as logistic regression analysis. The forecasting variables of E/S accidents in urban railway stations are considered, such as passenger age, drinking, overall situation, behavior, and handrail grip. In the overall accuracy analysis, the logistic regression accuracy is explained 76.7%. According to the results of this analysis, it has been confirmed that the accuracy and the level of significance of the logistic regression analysis make it a useful data mining technique to establish an accident severity prediction model for urban railway casualty accidents.