Weighted Support Vector Machines with the SCAD Penalty

Title & Authors
Weighted Support Vector Machines with the SCAD Penalty
Jung, Kang-Mo

Abstract
Classification is an important research area, as data are now easily obtained even when the number of predictors is huge. The support vector machine (SVM) is widely used to classify a subject into a predetermined group because it has a sound theoretical background and performs better than other methods in many applications. The SVM can be viewed as a penalized method with the hinge loss function and a penalty function. Instead of the $L_2$ penalty function, Fan and Li (2001) proposed the smoothly clipped absolute deviation (SCAD) penalty, which satisfies good statistical properties. Despite the ability of SVMs, they are not robust when the data contain outliers. We develop a robust SVM method that combines a weight function with the SCAD penalty function, based on the local quadratic approximation. We compare the performance of the proposed SVM with SVMs using the $L_1$ and $L_2$ penalty functions.
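For concreteness, the penalized formulation can be written out. The display below is a reconstruction from the abstract and from Fan and Li (2001); the case weights $v_i$ stand in for the paper's weight function, whose exact form is not specified here. The weighted SVM solves

$$ \min_{b,\,\mathbf{w}} \; \frac{1}{n}\sum_{i=1}^{n} v_i \bigl[\,1 - y_i(b + \mathbf{x}_i^{\top}\mathbf{w})\,\bigr]_{+} \;+\; \sum_{j=1}^{p} p_{\lambda}(|w_j|), $$

where $[u]_{+} = \max(u, 0)$ gives the hinge loss and the SCAD penalty $p_{\lambda}$ is defined through its derivative, for $\theta > 0$ and some $a > 2$ (Fan and Li suggest $a = 3.7$):

$$ p_{\lambda}'(\theta) = \lambda \left\{ I(\theta \le \lambda) + \frac{(a\lambda - \theta)_{+}}{(a - 1)\lambda}\, I(\theta > \lambda) \right\}. $$

Because $p_{\lambda}$ is nonconvex, the local quadratic approximation (LQA) of Fan and Li replaces it near a current iterate $w_j^{(k)} \neq 0$ by the quadratic surrogate

$$ p_{\lambda}(|w_j|) \approx p_{\lambda}(|w_j^{(k)}|) + \frac{1}{2}\, \frac{p_{\lambda}'(|w_j^{(k)}|)}{|w_j^{(k)}|} \bigl( w_j^{2} - (w_j^{(k)})^{2} \bigr), $$

so that each LQA iteration reduces to a weighted SVM with an adaptive ridge-type penalty.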
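As an illustration of how the pieces fit together, here is a minimal sketch in Python of an LQA-based fit for a weighted linear two-class SVM with the SCAD penalty. It is not the paper's algorithm: the subgradient inner solver, the least-squares initial estimator, the function name fit_wsvm_scad, and all tuning defaults are assumptions made for the sketch.

```python
# A minimal sketch, not the paper's algorithm: weighted linear SVM with the
# SCAD penalty, fitted by iterating the local quadratic approximation (LQA).
import numpy as np

def scad_deriv(t, lam, a=3.7):
    """Derivative p'_lambda of the SCAD penalty, evaluated at |t| (Fan and Li, 2001)."""
    t = np.abs(t)
    return lam * np.where(t <= lam, 1.0,
                          np.maximum(a * lam - t, 0.0) / ((a - 1.0) * lam))

def fit_wsvm_scad(X, y, v, lam, a=3.7, outer=10, inner=300, lr=0.1, eps=1e-4):
    """Minimize (1/n) sum_i v_i [1 - y_i (b + x_i'w)]_+ + sum_j p_lam(|w_j|).

    y takes values in {-1, +1}; v holds nonnegative case weights.
    """
    n, p = X.shape
    # LQA needs a nonzero starting value; use least squares as a crude initial estimator.
    Z = np.hstack([np.ones((n, 1)), X])
    coef = np.linalg.lstsq(Z, y, rcond=None)[0]
    b, w = coef[0], coef[1:].copy()
    for _ in range(outer):
        # LQA: replace p_lam(|w_j|) by a quadratic surrogate with curvature d_j.
        d = scad_deriv(w, lam, a) / np.maximum(np.abs(w), eps)
        for _ in range(inner):
            margin = y * (X @ w + b)
            g = v * (margin < 1.0) * y            # weighted hinge subgradient factor
            grad_w = -(X * g[:, None]).mean(axis=0)
            grad_b = -g.mean()
            # Subgradient step on the hinge loss; exact proximal (shrinkage)
            # step on the quadratic surrogate (1/2) d_j w_j^2, which stays
            # stable even when d_j is very large.
            w = (w - lr * grad_w) / (1.0 + lr * d)
            b -= lr * grad_b
        w[np.abs(w) < eps] = 0.0                  # prune coefficients absorbed to zero
    return w, b
```

Setting all case weights $v_i = 1$ recovers an unweighted SCAD SVM, while downweighting suspected outliers ($v_i < 1$) gives the robust behavior the abstract describes. A known drawback of LQA, discussed by Zou and Li (2008), is that a coefficient shrunk to zero stays at zero in later iterations; their one-step local linear approximation was proposed to avoid this.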
Keywords
Local quadratic approximation; multiclass support vector machine; penalty function; smoothly clipped absolute deviation; robustness; weight function
Language
English
Cited by
1.
Support Vector Machines for Unbalanced Multicategory Classification, Mathematical Problems in Engineering, 2015.
References
1.
Bradley, P. S. and Mangasarian, O. L. (1998). Feature selection via concave minimization and support vector machines, Proceedings of the 15th International Conference on Machine Learning, 82-90.

2.
Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression, The Annals of Statistics, 32, 407-499.

3.
Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association, 96, 1348-1360.

4.
Fan, J., Xue, L. and Zou, H. (2013). Strong oracle optimality of folded concave penalized estimation, Unpublished manuscript.

5.
Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer.

6.
Jung, K.-M. (2008). Robust statistical methods in variable selection, Journal of the Korean Data Analysis Society, 10, 3057-3066.

7.
Jung, K.-M. (2011). Weighted least absolute deviation lasso estimator, Communications of the Korean Statistical Society, 18, 733-739.

8.
Jung, K.-M. (2012). Multiclass support vector machines with SCAD, Communications of the Korean Statistical Society, 19, 655-662.

9.
Kim, Y., Choi, H. and Oh, H.-S. (2008). Smoothly clipped absolute deviation on high dimensions, Journal of the American Statistical Association, 103, 1665-1673.

10.
Lee, Y., Lin, Y. and Wahba, G. (2004). Multicategory support vector machines, theory and applications to the classification of microarray data and satellite radiance data, Journal of the American Statistical Association, 99, 67-81.

11.
Liu, Y. and Shen, X. (2006). Multicategory $\psi$-Learning, Journal of the American Statistical Association, 101, 500-509.

12.
Park, C., Kim, K.-R., Myung, R. and Koo, J.-Y. (2012). Oracle properties of SCAD-penalized support vector machine, Journal of Statistical Planning and Inference, 142, 2257-2270.

13.
Tibshirani, R. (1996). Regression shrinkage and selection via the LASSO, Journal of the Royal Statistical Society, Series B, 58, 267-288.

14.
Vapnik, V. (1995). The Nature of Statistical Learning Theory, Springer.

15.
Wahba, G. (1998). Support vector machines, reproducing kernel Hilbert spaces, and randomized GACV, In Advances in Kernel Methods: Support Vector Learning, eds. B. Scholkopf, C. J. C. Burges, and A. J. Smola, Cambridge, MA: MIT Press, 125-143.

16.
Weston, J. and Watkins, C. (1999). Support vector machines for multi-class pattern recognition, Proceedings of the Seventh European Symposium on Artificial Neural Networks.

17.
Wu, Y. and Liu, Y. (2007). Robust truncated-hinge-loss support vector machines, Journal of the American Statistical Association, 102, 974-983.

18.
Wu, Y. and Liu, Y. (2013). Adaptively weighted large margin classifiers, Journal of Computational and Graphical Statistics, 22, 416-432.

19.
Zhang, H. H., Ahn, J., Lin, X. and Park, C. (2006). Gene selection using support vector machines with non-convex penalty, Bioinformatics, 22, 88-95.

20.
Zhu, J., Rosset, S., Hastie, T. and Tibshirani, R. (2003). 1-norm support vector machines, In Advances in Neural Information Processing Systems 16, eds. S. Thrun, L. Saul and B. Scholkopf, Cambridge, MA: MIT Press, 49-56.

21.
Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models (with discussion), The Annals of Statistics, 36, 1509-1566.