Variable Selection with Log-Density in Logistic Regression Model
Kahng, Myung-Wook; Shin, Eun-Young
We present methods for studying the log-density ratio of the conditional densities of the predictors given the response variable in the logistic regression model. This ratio indicates which predictors are needed and in what form they should enter the model. If the conditional distributions are skewed, they can be modeled as gamma distributions. A simulation study shows that, in general, both the linear and log terms are required. If the conditional distributions of x|y for the two groups overlap significantly, both the linear and log terms are needed; if they are well separated, only the linear term or the log term is needed in the model.
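As a rough illustration of why linear and log terms arise, suppose x|y=k follows a gamma distribution with shape a_k and rate b_k (this parameterization, the function names, and the parameter values below are assumptions made for the sketch, not taken from the paper). The log-density ratio is then log f1(x)/f0(x) = c + (a1 - a0) log x - (b1 - b0) x, so the log term reflects a difference in shapes and the linear term a difference in rates. By Bayes' theorem, the logit of P(y=1|x) equals this ratio plus the log prior odds. A minimal sketch checks the closed form against the directly computed ratio:

```python
import math


def gamma_logpdf(x, shape, rate):
    """Log-density of a gamma distribution with the given shape and rate."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)


def log_density_ratio_terms(shape1, rate1, shape0, rate0):
    """Return (intercept, linear coefficient, log coefficient) of log f1(x)/f0(x)."""
    intercept = ((shape1 * math.log(rate1) - math.lgamma(shape1))
                 - (shape0 * math.log(rate0) - math.lgamma(shape0)))
    linear = -(rate1 - rate0)   # coefficient of x
    log_coef = shape1 - shape0  # coefficient of log(x)
    return intercept, linear, log_coef


# Hypothetical parameter values, chosen only for illustration.
b0, b_lin, b_log = log_density_ratio_terms(shape1=3.0, rate1=1.5,
                                           shape0=2.0, rate0=1.0)

x = 2.5
direct = gamma_logpdf(x, 3.0, 1.5) - gamma_logpdf(x, 2.0, 1.0)
closed_form = b0 + b_lin * x + b_log * math.log(x)
print(abs(direct - closed_form) < 1e-12)
```

In this sketch, equal shapes make the log coefficient vanish and equal rates make the linear coefficient vanish. It shows only the algebraic form of the ratio; the abstract's findings on overlapping versus well-separated groups come from the authors' simulation study and are not reproduced here.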
Log-density Ratio with Two Predictors in a Logistic Regression Model, Korean Journal of Applied Statistics, 2013, 26, 1, 141
Clark, R. G., Henderson, H. V., Hoggard, G. K., Ellison, R. S. and Young, B. J. (1987). The ability of biochemical and haematological tests to predict recovery in periparturient recumbent cows, New Zealand Veterinary Journal, 35, 126-133.
Cook, R. D. and Weisberg, S. (1999). Applied Regression Including Computing and Graphics, Wiley, New York.
Kay, R. and Little, S. (1987). Transformations of the explanatory variables in the logistic regression model for binary data, Biometrika, 74, 495-501.
Kullback, S. (1959). Information Theory and Statistics, Wiley, New York.
Nelder, J. A. and Wedderburn, R. W. M. (1972). Generalized linear models, Journal of the Royal Statistical Society: Series A (Statistics in Society), 135, 370-384.
Scrucca, L. (2003). Graphics for studying logistic regression models, Statistical Methods and Applications, 11, 371-394.
Scrucca, L. and Weisberg, S. (2004). A simulation study to investigate the behavior of the log-density ratio under normality, Communications in Statistics - Simulation and Computation, 33, 159-178.