• Title/Summary/Keyword: Logistic regression modeling

A Study on Improving the predict accuracy rate of Hybrid Model Technique Using Error Pattern Modeling : Using Logistic Regression and Discriminant Analysis

  • Cho, Yong-Jun;Hur, Joon
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.269-278 / 2006
  • This paper presents a new hybrid data mining technique that uses error pattern modeling to improve classification accuracy. The proposed method improves classification accuracy by combining two different supervised learning methods. The main algorithm generates an error pattern model between the two supervised learning methods (e.g., neural networks, decision trees, logistic regression, and so on). The proposed modeling method was applied to a simulation of 10,000 data sets generated from normal and exponential random distributions. The simulation results show that the proposed method outperforms existing methods such as logistic regression and discriminant analysis.

An Analysis on Relations between Design Errors Detected during BIM-based Design Validation and the Impacts Using Logistic Regression (로지스틱 회귀분석을 이용한 BIM 설계 검토에 의하여 발견된 설계 오류와 그 영향도간의 관계 분석)

  • Won, Jongsung
    • Proceedings of the Korean Institute of Building Construction Conference / 2017.05a / pp.264-265 / 2017
  • This paper analyzes the relations between design errors prevented by building information modeling (BIM)-based design validation and their impacts, in order to identify critical factors for successfully implementing BIM-based design validation in architecture, engineering, and construction (AEC) projects. More than 800 design errors detected by BIM-based design validation in two BIM-based projects in South Korea are categorized according to their causes and work types. The relations between the causes and work types of design errors and the project delay, cost overrun, low quality, and rework generation that the errors can cause are analyzed using logistic regression. Characteristics of each design error are analyzed through face-to-face interviews with practitioners in the two BIM-based projects. As a result, the impact of design error causes on predicting project delay, cost overrun, low quality, and rework generation was the highest.

MARS Modeling for Ordinal Categorical Response Data: A Case Study

  • Kim, Ji-Hyun
    • Communications for Statistical Applications and Methods / v.7 no.3 / pp.711-720 / 2000
  • This paper presents a case study of modeling ordinal categorical response data with the MARS method. The study analyzes the effects of personal characteristics and socioeconomic status on teenage marijuana use. The MARS method gave new insight into the data set.

An Introduction to Logistic Regression: From Basic Concepts to Interpretation with Particular Attention to Nursing Domain

  • Park, Hyeoun-Ae
    • Journal of Korean Academy of Nursing / v.43 no.2 / pp.154-164 / 2013
  • Purpose: The purpose of this article is twofold: 1) introducing logistic regression (LR), a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, and 2) examining the use and reporting of LR in the nursing literature. Methods: Textbooks on LR and research articles employing LR as the main statistical analysis were reviewed. Twenty-three articles published between 2010 and 2011 in the Journal of Korean Academy of Nursing were analyzed for proper use and reporting of LR models. Results: Logistic regression was presented from basic concepts, such as odds, odds ratio, logit transformation, and logistic curve, through assumptions, fitting, reporting, and interpreting, to cautions. Substantial shortcomings were found in both the use of LR and the reporting of results. For many studies, the sample size was not sufficiently large, which calls into question the accuracy of the regression model. Additionally, only one study reported a validation analysis. Conclusion: Nursing researchers need to pay greater attention to guidelines concerning the use and reporting of LR models.
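The basic quantities named in this abstract (odds, odds ratio, logit transformation) can be illustrated with a small sketch. This is not taken from the article; the function names and the probabilities and coefficient used below are hypothetical values chosen only for illustration.

```python
import math

def odds(p):
    """Odds corresponding to a probability p, with 0 < p < 1."""
    return p / (1.0 - p)

def logit(p):
    """Logit transformation: the natural log of the odds."""
    return math.log(odds(p))

def odds_ratio(p_group_a, p_group_b):
    """Ratio of the odds in group A to the odds in group B."""
    return odds(p_group_a) / odds(p_group_b)

def coefficient_to_odds_ratio(beta):
    """In logistic regression, exp(beta) is the odds ratio for a
    one-unit increase in the corresponding predictor."""
    return math.exp(beta)

# Hypothetical event probabilities in two groups (illustrative only).
p_a, p_b = 0.8, 0.5
print("odds A:", odds(p_a))
print("odds ratio A vs B:", odds_ratio(p_a, p_b))
print("logit of 0.5:", logit(p_b))
print("odds ratio for beta = 0.7:", coefficient_to_odds_ratio(0.7))
```

A probability of 0.5 corresponds to odds of 1 and a logit of 0, which is why a coefficient of 0 (odds ratio of 1) indicates no association.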

Landslide susceptibility mapping using Logistic Regression and Fuzzy Set model at the Boeun Area, Korea (로지스틱 회귀분석과 퍼지 기법을 이용한 산사태 취약성 지도작성: 보은군을 대상으로)

  • Al-Mamun, Al-Mamun;JANG, Dong-Ho
    • Journal of The Geomorphological Association of Korea / v.23 no.2 / pp.109-125 / 2016
  • This study aims to identify the landslide-susceptible zones of the Boeun area and provide reliable landslide susceptibility maps by applying different modeling methods. Aerial photographs and a field survey of the Boeun area were used to build a landslide inventory map consisting of 388 landslide locations. A total of seven landslide causative factors (elevation, slope angle, slope aspect, geology, soil, forest, and land use) were extracted from the database and converted into raster format. These causative factors were used to investigate the spatial relationship between each factor and landslide occurrence with the fuzzy set and logistic regression models. Fuzzy membership values and logistic regression coefficients were employed to determine each factor's rating for landslide susceptibility mapping. The landslide susceptibility maps were then compared and validated using a cross-validation technique. In the cross-validation process, 50% of the observed landslides were selected randomly in Excel, and two success rate curves (SRC) were generated for each landslide susceptibility map. The results show accuracy ratios of 84.34% and 83.29% for the logistic regression model and the fuzzy set model, respectively. This indicates that both models are reliable and reasonable methods for landslide susceptibility analysis.
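A success rate curve of the kind mentioned in this abstract can be sketched as follows. The abstract does not give the exact procedure, so this is an assumption: cells are ranked from most to least susceptible, the cumulative share of landslide cells is plotted against the cumulative share of area, and the area under that curve is taken as the accuracy figure. The function names and the four-cell data set are hypothetical.

```python
def success_rate_curve(susceptibility, landslide):
    """Return (area_fraction, landslide_fraction) points.

    susceptibility: susceptibility index of each map cell
    landslide:      1 if the cell contains an observed landslide, else 0
    Cells are ranked from most to least susceptible.
    """
    total = sum(landslide)
    if total == 0:
        raise ValueError("at least one landslide cell is required")
    order = sorted(range(len(susceptibility)),
                   key=lambda i: susceptibility[i], reverse=True)
    xs, ys = [0.0], [0.0]
    found = 0
    for rank, i in enumerate(order, start=1):
        found += landslide[i]
        xs.append(rank / len(order))   # cumulative share of area
        ys.append(found / total)       # cumulative share of landslides
    return xs, ys

def area_under_curve(xs, ys):
    """Trapezoidal area under the piecewise-linear curve."""
    return sum((xs[k] - xs[k - 1]) * (ys[k] + ys[k - 1]) / 2.0
               for k in range(1, len(xs)))

# Hypothetical four-cell map (illustrative only).
index = [0.9, 0.7, 0.4, 0.2]
observed = [1, 1, 0, 0]
xs, ys = success_rate_curve(index, observed)
print(area_under_curve(xs, ys))
```

A map that ranks all landslide cells ahead of all other cells pushes the curve toward the upper left, so a larger area indicates a better fit to the observed landslides.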

Comparison of the Performance of Log-logistic Regression and Artificial Neural Networks for Predicting Breast Cancer Relapse

  • Faradmal, Javad;Soltanian, Ali Reza;Roshanaei, Ghodratollah;Khodabakhshi, Reza;Kasaeian, Amir
    • Asian Pacific Journal of Cancer Prevention / v.15 no.14 / pp.5883-5888 / 2014
  • Background: Breast cancer is the most common cancer in female populations. The exact cause is not known, but it is most likely a combination of genetic and environmental factors. The log-logistic model (LLM) is applied as a statistical method for predicting survival and its influencing factors. In recent decades, artificial neural network (ANN) models have been increasingly applied to predict survival data. The present research was conducted to compare log-logistic regression and artificial neural network models in the prediction of breast cancer (BC) survival. Materials and Methods: A historical cohort study was established with 104 patients suffering from BC from 1997 to 2005. To compare the ANN and LLM in our setting, we used the estimated areas under the receiver-operating characteristic (ROC) curve (AUC) and integrated AUC (iAUC). The data were analyzed using R statistical software. Results: The AUCs for the first, second, and third years after diagnosis were 0.918, 0.780, and 0.800 for the ANN, and 0.834, 0.733, and 0.616 for the LLM, respectively. The mean AUC for the ANN was statistically higher than that of the LLM (0.845 vs. 0.744). Hence, this study showed a significant difference between the predictive performance of the ANN and the LLM. Conclusions: This study demonstrated that the predictive ability of the ANN was higher than that of the LLM. Thus, the use of the ANN method for predicting survival in the field of breast cancer is suggested.
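The AUC used above to compare the two models can be illustrated with a minimal sketch for a plain binary outcome. The study itself worked with survival data in R and also reported an integrated AUC, which this sketch does not reproduce; the function name and the scores and labels below are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = event, 0 = no event).

    Computed as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks and observed outcomes (illustrative only).
predicted = [0.9, 0.4, 0.35, 0.6, 0.5, 0.1]
observed = [1, 1, 0, 1, 0, 0]
print(auc(predicted, observed))
```

An AUC of 0.5 corresponds to ranking cases no better than chance, and 1.0 to ranking every event above every non-event.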

Evaluating seismic liquefaction potential using multivariate adaptive regression splines and logistic regression

  • Zhang, Wengang;Goh, Anthony T.C.
    • Geomechanics and Engineering / v.10 no.3 / pp.269-284 / 2016
  • Simplified techniques based on in situ testing methods are commonly used to assess seismic liquefaction potential. Many of these simplified methods were developed by analyzing liquefaction case histories from which the liquefaction boundary (limit state) separating two categories (the occurrence or non-occurrence of liquefaction) is determined. As the liquefaction classification problem is highly nonlinear in nature, it is difficult to develop a comprehensive model using conventional modeling techniques that take into consideration all the independent variables, such as the seismic and soil properties. In this study, a modification of the Multivariate Adaptive Regression Splines (MARS) approach based on Logistic Regression (LR), termed LR_MARS, is used to evaluate seismic liquefaction potential based on actual field records. Three different LR_MARS models were used to analyze three different field liquefaction databases, and the results are compared with those of neural network approaches. The developed spline functions and the limit state functions obtained reveal that the LR_MARS models can capture and describe the intrinsic, complex relationship between seismic parameters, soil parameters, and the liquefaction potential without having to make any assumptions about the underlying relationship between the various variables. Considering its computational efficiency, simplicity of interpretation, predictive accuracy, data-driven and adaptive nature, and ability to map the interaction between variables, the use of the LR_MARS model in assessing seismic liquefaction potential is promising.

Modeling the Natural Occurrence of Selected Dipterocarp Genera in Sarawak, Borneo

  • Teo, Stephen;Phua, Mui-How
    • Journal of Forest and Environmental Science / v.28 no.3 / pp.170-178 / 2012
  • Dipterocarps, or Dipterocarpaceae, are a commercially important timber-producing and dominant keystone tree family in the rain forests of Borneo. Borneo's landscape has been changing at an unprecedented rate in recent years, which affects this important biodiversity. This paper attempts to model the natural occurrence (the distribution including areas that had natural forests before being converted to other land uses, as opposed to the current distribution) of dipterocarp species in Sarawak, which is important for forest biodiversity conservation and management. The local modeling method of Inverse Distance Weighting was compared with a commonly used statistical method (Binary Logistic Regression) to build the best natural distribution models for three genera (12 species) of dipterocarps. A database of species occurrence data and pseudo-absence data was constructed and divided into two halves for model building and validation. For logistic regression modeling, climatic, topographical, and edaphic parameters were used. Proxy variables were used to represent the parameters that were highly (p>0.75) correlated, to avoid over-fitting. The results show that Inverse Distance Weighting produced the best and most consistent predictions, with an average accuracy of over 80%. This study demonstrates that a local interpolation method can be used for modeling the natural distribution of dipterocarp species. Inverse Distance Weighting proved to be the better method, and the possible reasons are discussed.
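Inverse Distance Weighting, the local method compared with logistic regression above, can be sketched in a few lines. The abstract does not state the power parameter or neighborhood used in the study, so the power of 2, the function name, and the two sample points below are assumptions for illustration only.

```python
import math

def idw_predict(known_points, query, power=2.0):
    """Inverse Distance Weighting estimate at a query location.

    known_points: list of ((x, y), value) pairs, e.g. value 1 for a
                  recorded occurrence and 0 for a pseudo-absence
    query:        (x, y) location to predict
    power:        distance exponent (2.0 is assumed here)
    """
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in known_points:
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            # The query coincides with a sample point: return its value.
            return float(value)
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical occurrence (1) and pseudo-absence (0) records.
samples = [((0.0, 0.0), 1), ((2.0, 0.0), 0)]
print(idw_predict(samples, (1.0, 0.0)))   # midway between the two points
print(idw_predict(samples, (0.5, 0.0)))   # closer to the occurrence record
```

Because nearer samples receive larger weights, the estimate moves toward 1 as the query approaches the occurrence record and toward 0 as it approaches the pseudo-absence.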

Risk factors for unexpected readmission and reoperation following open procedures for shoulder instability: a national database study of 1,942 cases

  • John M. Tarazi;Matthew J. Partan;Alton Daley;Brandon Klein;Luke Bartlett;Randy M. Cohn
    • Clinics in Shoulder and Elbow / v.26 no.3 / pp.252-259 / 2023
  • Background: The purpose of this study was to identify demographics and risk factors associated with unplanned 30-day readmission and reoperation following open procedures for shoulder instability and to examine recent trends in open shoulder instability procedures. Methods: The American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) database was queried using Current Procedural Terminology (CPT) codes 23455, 23460, and 23462 to find patients who underwent shoulder instability surgery from 2015 to 2019. Independent sample Student t-tests and chi-square tests were used in univariate analyses to identify demographic, lifestyle, and perioperative variables related to 30-day readmission following repair for shoulder instability. Multivariate logistic regression modeling was subsequently performed. Results: In total, 1,942 cases of open surgical procedures for shoulder instability were identified. Within our study sample, 1.27% of patients were readmitted within 30 days of surgery, and 0.85% required reoperation. Multivariate logistic regression modeling confirmed that the following patient variables were associated with a statistically significant increase in the odds of readmission: open anterior bone block/Latarjet-Bristow procedure, being a current smoker, and a long hospital stay (all P<0.05). Multivariate logistic regression modeling confirmed statistically significant increased odds of reoperation with an open anterior bone block or Latarjet-Bristow procedure (P<0.05). Conclusions: Unplanned 30-day readmission and reoperation after open shoulder instability surgery are infrequent. Patients who are current smokers, have an open anterior bone block or Latarjet-Bristow procedure, or have a longer than average hospital stay have higher odds of readmission than others. Patients who undergo an open anterior bone block or Latarjet-Bristow procedure have higher odds of reoperation than those who undergo an open soft-tissue procedure.
Level of evidence: III.

An educational tool for binary logistic regression model using Excel VBA (엑셀 VBA를 이용한 이분형 로지스틱 회귀모형 교육도구 개발)

  • Park, Cheolyong;Choi, Hyun Seok
    • Journal of the Korean Data and Information Science Society / v.25 no.2 / pp.403-410 / 2014
  • Binary logistic regression analysis is a statistical technique that explains a binary response variable using quantitative or qualitative explanatory variables. In the binary logistic regression model, the probability that the response variable equals one of the binary values, say 1, is explained as a transformation of a linear combination of explanatory variables. This is one of the big barriers that non-statisticians have to overcome in order to understand the model. In this study, an educational tool is developed using Excel VBA that explains the need for binary logistic regression analysis. More precisely, the tool explains the problems that arise when the probability that the response variable equals 1 is modeled as a linear combination of explanatory variables, and then shows how these problems can be solved through transformations of the linear combination.
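The problem this abstract describes can be seen in a short sketch: a linear combination of explanatory variables is unbounded, so it can fall outside the range of a probability, whereas the logistic transformation of the same quantity always lies between 0 and 1. The sketch is in Python rather than the Excel VBA used by the authors, and the intercept, slope, and x values are hypothetical.

```python
import math

def linear_predictor(x, intercept, slope):
    """Linear combination of a single explanatory variable."""
    return intercept + slope * x

def logistic(z):
    """Logistic transformation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients (illustrative only).
intercept, slope = -0.2, 0.3

for x in [-2, 0, 2, 5]:
    eta = linear_predictor(x, intercept, slope)
    # eta used directly as a "probability" can be below 0 or above 1;
    # logistic(eta) is always a valid probability.
    print(x, round(eta, 3), round(logistic(eta), 3))
```

With these values, the linear combination is negative at x = -2 and exceeds 1 at x = 5, while the transformed values remain valid probabilities throughout.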