Principal Components Logistic Regression based on Robust Estimation
 Title & Authors
Principal Components Logistic Regression based on Robust Estimation
Kim, Bu-Yong; Kahng, Myung-Wook; Jang, Hea-Won
 Abstract
Logistic regression is widely used as a data mining technique for customer relationship management. The maximum likelihood estimator has highly inflated variance when multicollinearity exists among the regressors, and it is not robust against outliers. We therefore propose robust principal components logistic regression to deal with both the multicollinearity and outlier problems. A procedure based on the condition index is suggested for selecting principal components: when a condition index exceeds the cutoff value obtained from a model constructed on the basis of conjoint analysis, the corresponding principal component is removed from the logistic model. In addition, we employ a robust estimation algorithm that dampens the effect of outliers by applying appropriate weights and factors to the leverage points and vertical outliers identified by the V-mask type criterion. Monte Carlo simulation results indicate that the proposed procedure yields a higher rate of correct classification than the existing method.
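The component-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The fixed cutoff of 30 is a hypothetical placeholder for the cutoff that the paper derives from a conjoint-analysis model, the fit is plain maximum likelihood by iteratively reweighted least squares, and the robust weighting and V-mask type outlier identification are omitted. All function names are illustrative.

```python
import numpy as np


def condition_indices(X):
    """Standardize X and return (Z, eigenvectors, condition indices).

    Condition index k is sqrt(largest eigenvalue / k-th eigenvalue) of the
    correlation matrix, with eigenvalues sorted in decreasing order.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    # Guard against tiny negative eigenvalues caused by rounding.
    eigvals = np.clip(eigvals[order], 1e-12, None)
    eigvecs = eigvecs[:, order]
    return Z, eigvecs, np.sqrt(eigvals[0] / eigvals)


def fit_logistic_irls(A, y, n_iter=50, tol=1e-8):
    """Ordinary (non-robust) ML logistic fit; A must include an intercept column."""
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ beta)))
        w = np.clip(p * (1.0 - p), 1e-10, None)
        step = np.linalg.solve(A.T @ (A * w[:, None]), A.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def pc_logistic(X, y, cutoff=30.0):
    """Principal components logistic regression with condition-index selection.

    cutoff=30.0 is an assumed value, not the paper's conjoint-analysis cutoff.
    Returns (intercept, coefficients on standardized regressors, keep mask,
    condition indices).
    """
    Z, V, ci = condition_indices(X)
    keep = ci <= cutoff                      # drop components above the cutoff
    scores = Z @ V[:, keep]                  # retained principal component scores
    A = np.column_stack([np.ones(len(y)), scores])
    gamma = fit_logistic_irls(A, y)
    beta_std = V[:, keep] @ gamma[1:]        # map back to standardized regressors
    return gamma[0], beta_std, keep, ci


def make_collinear_data(n=500, seed=0):
    """Synthetic example: the third regressor is almost x1 + x2."""
    rng = np.random.default_rng(seed)
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    x3 = x1 + x2 + 0.001 * rng.normal(size=n)
    X = np.column_stack([x1, x2, x3])
    p = 1.0 / (1.0 + np.exp(-(0.5 + x1 - x2)))
    y = (rng.random(n) < p).astype(float)
    return X, y
```

Calling `pc_logistic(*make_collinear_data())` removes the component associated with the near-exact linear dependence and fits the logistic model on the remaining component scores. A robust variant would replace `fit_logistic_irls` with a weighted fit that downweights leverage points and vertical outliers.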
 Keywords
Data mining; multicollinearity; outlier; principal components logistic regression; robust estimation
 Language
Korean
 Cited by
1.
Diet and Lifestyle Factors Affecting Obesity: A Korea National Health and Nutrition Survey Analysis, Preventive Nutrition and Food Science, 2011, vol. 16, no. 2, pp. 117-126
 References
1.
Aguilera, A. M., Escabias, M. and Valderrama, M. J. (2006). Using principal components for estimating logistic regression with high-dimensional multicollinear data, Computational Statistics & Data Analysis, 50, 1905-1924

2.
Carroll, R. J. and Pederson, S. (1993). On robustness in the logistic regression model, Journal of the Royal Statistical Society, Series B, 55, 693-706

3.
Copas, J. B. (1988). Binary regression models for contaminated data, Journal of the Royal Statistical Society, Series B, 50, 225-265

4.
Croux, C. and Haesbroeck, G. (2003). Implementing the Bianco and Yohai estimator for logistic regression, Computational Statistics & Data Analysis, 44, 273-295

5.
Hadi, A. S. (1994). A modification of a method for the detection of outliers in multivariate samples, Journal of the Royal Statistical Society, Series B, 56, 393-396

6.
Hardin, J. and Rocke, D. M. (2004). Outlier detection in the multiple cluster setting using the minimum covariance determinant estimator, Computational Statistics & Data Analysis, 44, 625-638

7.
Kim, B. Y. (2005). V-mask type criterion for identification of outliers in logistic regression, The Korean Communications in Statistics, 12, 625-634

8.
Kim, B. Y. and Kahng, M. W. (2008). Principal components regression in logistic model, The Korean Journal of Applied Statistics, 21, 571-580

9.
Kim, B. Y., Kahng, M. W. and Choi, M. A. (2007). Algorithm for the robust estimation in logistic regression, The Korean Journal of Applied Statistics, 20, 551-559

10.
Kordzakhia, N., Mishra, G. D. and Reiersolmoen, L. (2001). Robust estimation in the logistic regression model, Journal of Statistical Planning and Inference, 98, 211-223

11.
Mason, R. L. and Gunst, R. F. (1985). Selecting principal components in regression, Statistics & Probability Letters, 3, 299-301

12.
Rousseeuw, P. J. and Van Driessen, K. (1999). A fast algorithm for the minimum covariance determinant estimator, Technometrics, 41, 212-223

13.
Rousseeuw, P. J. and Leroy, A. M. (2003). Robust Regression and Outlier Detection, John Wiley & Sons, New York

14.
Schaefer, R. L. (1986). Alternative estimators in logistic regression when the data are collinear, Journal of Statistical Computation and Simulation, 25, 75-91

15.
Woodruff, D. L. and Rocke, D. M. (1994). Computable robust estimation of multivariate location and shape in high dimension using compound estimators, Journal of the American Statistical Association, 89, 888-896