A Study on Applying Shrinkage Method in Generalized Additive Model Ki, Seung-Do; Kang, Kee-Hoon;
The generalized additive model (GAM) is a statistical model that resolves most of the problems found in the traditional linear regression model. However, overfitting can arise if no method is applied to reduce the number of independent variables, so variable selection methods are needed for generalized additive models. Recently, Lasso-related methods have become popular for variable selection in regression analysis. In this research, we consider Group Lasso and Elastic net models for variable selection in GAM and propose an algorithm for finding solutions. We compare the proposed methods via Monte Carlo simulation and an application to auto insurance data from fiscal year 2005. It is shown that the proposed methods yield better performance.
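The abstract's core idea, selecting whole variables in an additive model by penalizing each predictor's group of basis coefficients, can be illustrated with a minimal sketch. This is not the authors' proposed algorithm; it is a generic proximal-gradient Group Lasso fit in which each predictor is expanded in a polynomial basis (a stand-in for the spline bases typically used in GAMs), and all names (`poly_basis`, `group_lasso_gam`) and the toy data are assumptions for illustration.

```python
import numpy as np

def poly_basis(x, degree=3):
    # Expand one predictor into a polynomial basis (stand-in for splines).
    return np.column_stack([x ** d for d in range(1, degree + 1)])

def group_lasso_gam(X, y, lam, degree=3, lr=0.01, n_iter=2000):
    """Proximal-gradient fit of an additive model with a Group Lasso
    penalty: each predictor's basis coefficients form one group, so a
    predictor is dropped only when its whole block is shrunk to zero."""
    n, p = X.shape
    # Build the design matrix; standardize columns so one step size works.
    B = np.column_stack([poly_basis(X[:, j], degree) for j in range(p)])
    B = (B - B.mean(0)) / B.std(0)
    groups = [np.arange(j * degree, (j + 1) * degree) for j in range(p)]
    beta = np.zeros(B.shape[1])
    intercept = y.mean()
    r = y - intercept
    for _ in range(n_iter):
        grad = -B.T @ (r - B @ beta) / n           # squared-error gradient
        beta = beta - lr * grad
        for g in groups:                           # blockwise soft threshold
            norm = np.linalg.norm(beta[g])
            beta[g] = 0.0 if norm == 0 else max(0.0, 1 - lr * lam / norm) * beta[g]
    return intercept, beta, groups

# Toy data: only the first two of five predictors matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)
intercept, beta, groups = group_lasso_gam(X, y, lam=0.5)
selected = [j for j, g in enumerate(groups) if np.linalg.norm(beta[g]) > 1e-8]
print(selected)  # indices of predictors whose coefficient block is nonzero
```

An Elastic-net variant would add a ridge term to the gradient (`+ alpha * beta`), which stabilizes the fit when the basis columns are highly correlated, as they are for polynomial expansions.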
Journal of the Korean Data and Information Science Society, 2012, Vol. 23, No. 2, pp. 235-245.
Bakin, S. (1999). Adaptive regression and model selection in data mining problems, Ph.D. Dissertation, The Australian National University, Canberra.
Fu, W. (1998). Penalized regressions: The Bridge versus the Lasso, Journal of Computational and Graphical Statistics, 7, 397-416.
Genkin, A., Lewis, D. D. and Madigan, D. (2007). Large-scale Bayesian logistic regression for text categorization, Technometrics, 49, 291-304.
Hastie, T. and Tibshirani, R. (1986). Generalized additive models (with discussion), Statistical Science, 1, 297-318.
Kim, Y., Kim, J. and Kim, Y. (2006). Blockwise sparse regression, Statistica Sinica, 16, 375-390.
Krishnapuram, B., Carin, L., Figueiredo, M. A. and Hartemink, A. J. (2005). Sparse multinomial logistic regression: Fast algorithms and generalization bounds, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 957-968.
Lokhorst, J. (1999). The Lasso and generalized linear models, Honors Project, University of Adelaide, Adelaide.
Meier, L., van de Geer, S. and Bühlmann, P. (2008). The Group Lasso for logistic regression, Journal of the Royal Statistical Society, Series B, 70, 53-71.
Roth, V. (2004). The generalized Lasso, IEEE Transactions on Neural Networks, 15, 16-28.
Shevade, S. and Keerthi, S. (2003). A simple and efficient algorithm for gene selection using sparse logistic regression, Bioinformatics, 19, 2246-2253.
Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society, Series B, 68, 49-67.
Zhao, P., Rocha, G. and Yu, B. (2006). Grouped and hierarchical model selection through composite absolute penalties, Technical Report, University of California at Berkeley, Department of Statistics.
Zhou, N. and Zhu, J. (2007). Group variable selection via hierarchical Lasso and its oracle property, Manuscript.
Zou, H. and Hastie, T. (2005). Regularization and variable selection via the Elastic net, Journal of the Royal Statistical Society, Series B, 67, 301-320.