Feature selection in the semivarying coefficient LS-SVR

  • Received : 2017.02.22
  • Accepted : 2017.03.27
  • Published : 2017.03.31

Abstract

In this paper we propose a feature selection method for identifying important features in the semivarying coefficient model. One important issue in the semivarying coefficient model is how to estimate its parametric and nonparametric components; another is how to identify the important features among the varying and the constant effects. The proposed method addresses the latter issue using the generalized cross validation (GCV) functions of the varying coefficient least squares support vector regression (LS-SVR) and the linear LS-SVR. Numerical studies indicate that the proposed method is quite effective in identifying the important features in both the varying and the constant effects of the semivarying coefficient model.
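The abstract describes screening features by comparing GCV values of LS-SVR fits. The following is a minimal sketch of that idea only, not the authors' procedure: the RBF kernel, the fixed hyperparameters, the function names (ls_svr_gcv, screen_features), and the delete-one-feature screening rule are illustrative assumptions; the paper's separate treatment of the varying coefficient LS-SVR and the linear LS-SVR components is not reproduced here.

```python
# Minimal sketch: GCV-based feature screening with a least squares SVR.
# Everything here (kernel choice, hyperparameters, screening rule) is an
# illustrative assumption, not the procedure proposed in the paper.
import numpy as np


def rbf_kernel(X1, X2, sigma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of X1 and X2
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-d2 / (2 * sigma**2))


def ls_svr_gcv(X, y, gamma=10.0, sigma=1.0):
    """Fit an LS-SVR and return its GCV value (Craven and Wahba, 1979)."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    # LS-SVR dual system: [[0, 1'], [1, K + I/gamma]] [b; alpha] = [0; y]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    A_inv = np.linalg.inv(A)
    sol = A_inv @ np.concatenate(([0.0], y))
    b, alpha = sol[0], sol[1:]
    y_hat = K @ alpha + b
    # Hat matrix: y_hat = [1, K] A_inv [0; y], so keep the columns acting on y
    H = (np.hstack([np.ones((n, 1)), K]) @ A_inv)[:, 1:]
    rss = np.sum((y - y_hat) ** 2)
    return n * rss / (n - np.trace(H)) ** 2


def screen_features(X, y, **kw):
    """Rank features by how much GCV deteriorates when each one is deleted."""
    base = ls_svr_gcv(X, y, **kw)
    scores = {j: ls_svr_gcv(np.delete(X, j, axis=1), y, **kw) - base
              for j in range(X.shape[1])}
    return base, scores  # larger score -> more important feature


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 100
    X = rng.uniform(-1.0, 1.0, (n, 4))        # features 2 and 3 are irrelevant
    y = np.sin(np.pi * X[:, 0]) + 2.0 * X[:, 1] + 0.1 * rng.standard_normal(n)
    base, scores = screen_features(X, y, gamma=10.0, sigma=0.5)
    print("GCV of the full model:", round(base, 4))
    for j, s in sorted(scores.items(), key=lambda t: -t[1]):
        print(f"feature {j}: GCV increase when deleted = {s:.4f}")
```

In the same spirit, the comparison above would be carried out separately for the varying effects (via the varying coefficient LS-SVR) and the constant effects (via the linear LS-SVR) of the semivarying coefficient model.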

Keywords

Acknowledgement

Supported by: National Research Foundation of Korea (NRF)

References

  1. Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403.
  2. Fan, J. and Huang, T. (2005). Profile likelihood inferences on semiparametric varying coefficient partially linear models. Bernoulli, 11, 1031-1057. https://doi.org/10.3150/bj/1137421639
  3. Fan, J. and Zhang, W. (2008). Statistical methods with varying coefficient models. Statistics and Its Interface, 1, 179-195. https://doi.org/10.4310/SII.2008.v1.n1.a15
  4. Hastie, T. and Tibshirani, R. (1993). Varying-coefficient models. Journal of the Royal Statistical Society B, 55, 757-796.
  5. Hoover, D. R., Rice, J. A., Wu, C. O. and Yang, L. P. (1998). Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data. Biometrika, 85, 809-822. https://doi.org/10.1093/biomet/85.4.809
  6. Huang, J., Ma, S. and Xie, H. (2005). Regularized estimation in the accelerated failure time model with high dimensional covariates, Technical Report No. 349, Department of Statistics and Actuarial Science, The University of Iowa, IA, USA.
  7. Hwang, C. and Shim, J. (2016). Deep LS-SVM for regression. Journal of the Korean Data & Information Science Society, 27, 827-833. https://doi.org/10.7465/jkdi.2016.27.3.827
  8. Hwang, C., Bae, J. and Shim, J. (2016). Robust varying coefficient model using L1 penalized locally weighted regression. Journal of the Korean Data & Information Science Society, 27, 1059-1066. https://doi.org/10.7465/jkdi.2016.27.4.1059
  9. Lee, Y. K., Mammen, E. and Park, B. U. (2012). Flexible generalized varying coefficient regression models. Annals of Statistics, 40, 1906-1933. https://doi.org/10.1214/12-AOS1026
  10. Li, Q. and Racine, J. S. (2010). Smooth varying-coefficient estimation and inference for qualitative and quantitative data. Econometric Theory, 26, 1607-1637. https://doi.org/10.1017/S0266466609990739
  11. Mercer, J. (1909). Functions of positive and negative type, and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society of London A, 209, 415-446.
  12. Sauerbrei, W. and Schumacher, M. (1992). A bootstrap resampling procedure for model building: Application to the Cox regression model. Statistics in Medicine, 11, 2093-2099. https://doi.org/10.1002/sim.4780111607
  13. Shim, J. and Hwang, C. (2015). Varying coefficient modeling via least squares support vector regression. Neurocomputing, 161, 254-259. https://doi.org/10.1016/j.neucom.2015.02.036
  14. Suykens, J. A. K. and Vandewalle, J. (1999). Least squares support vector machine classifiers. Neural Processing Letters, 9, 293-300. https://doi.org/10.1023/A:1018628609742
  15. Suykens, J. A. K., Vandewalle, J. and De Moor, B. (2001). Optimal control by least squares support vector machines. Neural Networks, 14, 23-35. https://doi.org/10.1016/S0893-6080(00)00077-0
  16. Tibshirani, R. (1997). The lasso method for variable selection in the Cox model. Statistics in Medicine, 16, 385-395. https://doi.org/10.1002/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3
  17. Vapnik, V. (1995). The nature of statistical learning theory, Springer, Berlin.
  18. Vapnik, V. (1998). Statistical learning theory, Wiley, New York.
  19. Wooldridge, J. M. (2012). Introductory econometrics: A modern approach, South-Western Cengage Learning, Mason.
  20. Wu, C., Shi, X., Cui, Y. and Ma, S. (2015). A penalized robust semiparametric approach for gene-environment interactions. Statistics in Medicine, 34, 4016-4030. https://doi.org/10.1002/sim.6609
  21. Xue, L. and Qu, A. (2012). Variable selection in high-dimensional varying-coefficient models with global optimality. Journal of Machine Learning Research, 13, 1973-1998.
  22. Yang, L., Park, B. U., Xue, L. and Härdle, W. (2006). Estimation and testing for varying coefficients in additive models with marginal integration. Journal of the American Statistical Association, 101, 1212-1227. https://doi.org/10.1198/016214506000000429
  23. Zhang, W., Lee, S. and Song, X. (2002). Local polynomial fitting in semivarying coefficient models. Journal of Multivariate Analysis, 82, 166-188. https://doi.org/10.1006/jmva.2001.2012

Cited by

  1. A study on a composite support vector quantile regression with varying coefficient model vol.29, pp.4, 2017, https://doi.org/10.7465/jkdi.2018.29.4.1077
  2. Deep multiple kernel least squares support vector regression machine vol.29, pp.4, 2017, https://doi.org/10.7465/jkdi.2018.29.4.895