Weighted Support Vector Machines with the SCAD Penalty

  • Jung, Kang-Mo (Department of Statistics and Computer Science, Kunsan National University)
  • Received : 2013.10.01
  • Accepted : 2013.10.31
  • Published : 2013.11.30

Abstract

Classification is an important research area, as data can now be obtained easily even when the number of predictors is huge. The support vector machine (SVM) is widely used to classify a subject into one of a number of predetermined groups because it has a sound theoretical foundation and performs better than many other methods in applications. The SVM can be viewed as a penalized method that combines the hinge loss function with a penalty function. Instead of the $L_2$ penalty function, Fan and Li (2001) proposed the smoothly clipped absolute deviation (SCAD) penalty, which possesses desirable statistical properties such as unbiasedness, sparsity, and continuity. Despite their strengths, SVMs are not robust when the data contain outliers. We develop a robust SVM that combines a weight function with the SCAD penalty function, based on the local quadratic approximation. We compare the performance of the proposed SVM with that of SVMs using the $L_1$ and $L_2$ penalty functions.
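
The scheme described above can be made concrete. The SCAD penalty of Fan and Li (2001) has derivative $p'_{\lambda}(t) = \lambda\{I(t \le \lambda) + \frac{(a\lambda - t)_{+}}{(a-1)\lambda}I(t > \lambda)\}$ with $a = 3.7$, and the local quadratic approximation (LQA) replaces $p_{\lambda}(|w_j|)$ near a current iterate $w_j^{(0)}$ by a quadratic with curvature $p'_{\lambda}(|w_j^{(0)}|)/|w_j^{(0)}|$, so each iteration solves a ridge-type problem. The sketch below is an illustrative NumPy implementation of a weighted linear SVM fitted this way; it is not the paper's code, and the subgradient-plus-proximal inner solver, the learning rate, the ridge pilot step, and the particular weight function in the usage example are all assumptions made for the illustration.

```python
import numpy as np

def scad_deriv(t, lam, a=3.7):
    """Derivative p'_lambda(t) of the SCAD penalty (Fan and Li, 2001), t >= 0."""
    t = np.abs(t)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

def weighted_scad_svm(X, y, lam, weights=None, a=3.7,
                      n_lqa=10, n_inner=500, lr=0.01, eps=1e-6):
    """Linear SVM with weighted hinge loss and SCAD penalty, fitted by
    iterating the local quadratic approximation (LQA) of the penalty.
    y is coded +1/-1; `weights` are per-observation weights v_i."""
    n, p = X.shape
    v = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    w, b = np.zeros(p), 0.0
    sigma = np.full(p, 1e-3)          # first pass: a light ridge pilot fit
    for _ in range(n_lqa):
        for _ in range(n_inner):
            margin = y * (X @ w + b)
            active = v * (margin < 1)             # weighted hinge subgradient
            grad_w = -(X.T @ (active * y)) / n
            grad_b = -np.sum(active * y) / n
            # gradient step on the loss, exact proximal shrink for the
            # LQA quadratic (1/2)*sigma_j*w_j^2 (stable for any sigma_j)
            w = (w - lr * grad_w) / (1.0 + lr * sigma)
            b -= lr * grad_b
        # refresh the LQA curvatures at the new expansion point
        sigma = scad_deriv(w, lam, a) / (np.abs(w) + eps)
    w[np.abs(w) < 1e-4] = 0.0         # report near-zero coefficients as zero
    return w, b

# Toy usage: a pilot fit with unit weights, then downweight observations
# with strongly negative margins (one illustrative choice of weight function;
# the paper's specific weight function may differ).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = np.where(X[:, 0] - X[:, 1] + 0.3 * rng.normal(size=200) > 0, 1.0, -1.0)
w0, b0 = weighted_scad_svm(X, y, lam=0.1)
m = y * (X @ w0 + b0)
v = np.where(m < -1.0, 1.0 / (1.0 + np.abs(m)), 1.0)
w1, b1 = weighted_scad_svm(X, y, lam=0.1, weights=v)
print("nonzero coefficients:", np.flatnonzero(w1))
```

The proximal shrink in the inner loop keeps the update stable even when a coefficient sits near zero and its LQA curvature blows up, which is how the approximation drives small coefficients to exact zero.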

References

  1. Bradley, P. S. and Mangasarian, O. L. (1998). Feature selection via concave minimization and support vector machines, Proceedings of the Fifteenth International Conference on Machine Learning, 82-90.
  2. Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression, Annals of Statistics, 32, 407-499. https://doi.org/10.1214/009053604000000067
  3. Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association, 96, 1348-1360. https://doi.org/10.1198/016214501753382273
  4. Fan, J., Xue, L. and Zou, H. (2013). Strong oracle optimality of folded concave penalized estimation, Unpublished manuscript.
  5. Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer.
  6. Jung, K.-M. (2008). Robust statistical methods in variable selection, Journal of the Korean Data Analysis Society, 10, 3057-3066.
  7. Jung, K.-M. (2011). Weighted least absolute deviation lasso estimator, Communications of the Korean Statistical Society, 18, 733-739. https://doi.org/10.5351/CKSS.2011.18.6.733
  8. Jung, K.-M. (2012). Multiclass support vector machines with SCAD, Communications of the Korean Statistical Society, 19, 655-662. https://doi.org/10.5351/CKSS.2012.19.5.655
  9. Kim, Y., Choi, H. and Oh, H.-S. (2008). Smoothly clipped absolute deviation on high dimensions, Journal of the American Statistical Association, 103, 1665-1673. https://doi.org/10.1198/016214508000001066
  10. Lee, Y., Lin, Y. and Wahba, G. (2004). Multicategory support vector machines: Theory and application to the classification of microarray data and satellite radiance data, Journal of the American Statistical Association, 99, 67-81. https://doi.org/10.1198/016214504000000098
  11. Liu, Y. and Shen, X. (2006). Multicategory $\psi$-Learning, Journal of the American Statistical Association, 101, 500-509. https://doi.org/10.1198/016214505000000781
  12. Park, C., Kim, K.-R., Myung, R. and Koo, J.-Y. (2012). Oracle properties of SCAD-penalized support vector machine, Journal of Statistical Planning and Inference, 142, 2257-2270. https://doi.org/10.1016/j.jspi.2012.03.002
  13. Tibshirani, R. J. (1996). Regression shrinkage and selection via the LASSO, Journal of the Royal Statistical Society, Series B, 58, 267-288.
  14. Vapnik, V. (1995). The Nature of Statistical Learning Theory, Springer.
  15. Wahba, G. (1998). Support vector machines, reproducing kernel Hilbert spaces, and randomized GACV, In Advances in Kernel Methods: Support Vector Learning, eds. B. Schölkopf, C. J. C. Burges and A. J. Smola, Cambridge, MA: MIT Press, 125-143.
  16. Weston, J. and Watkins, C. (1999). Support vector machines for multi-class pattern recognition, Proceedings of the Seventh European Symposium on Artificial Neural Networks.
  17. Wu, Y. and Liu, Y. (2007). Robust truncated-hinge-loss support vector machines, Journal of the American Statistical Association, 102, 974-983. https://doi.org/10.1198/016214507000000617
  18. Wu, Y. and Liu, Y. (2013). Adaptively weighted large margin classifiers, Journal of Computational and Graphical Statistics, 22, 416-432. https://doi.org/10.1080/10618600.2012.680866
  19. Zhang, H. H., Ahn, J., Lin, X. and Park, C. (2006). Gene selection using support vector machines with non-convex penalty, Bioinformatics, 22, 88-95. https://doi.org/10.1093/bioinformatics/bti736
  20. Zhu, J., Rosset, S., Hastie, T. and Tibshirani, R. (2003). 1-norm support vector machines, In Advances in Neural Information Processing Systems 16, eds. S. Thrun, L. Saul and B. Schölkopf, Cambridge, MA: MIT Press, 49-56.
  21. Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models (with discussion), The Annals of Statistics, 36, 1509-1566. https://doi.org/10.1214/009053607000000802

Cited by

  1. Support Vector Machines for Unbalanced Multicategory Classification, vol. 2015, 2015, https://doi.org/10.1155/2015/294985
  2. Penalized rank regression estimator with the smoothly clipped absolute deviation function, vol. 24, no. 6, 2017, https://doi.org/10.29220/CSAM.2017.24.6.673