Modifying linearly non-separable support vector machine binary classifier to account for the centroid mean vector

  • Received : 2022.06.10
  • Accepted : 2022.09.20
  • Published : 2023.05.31

Abstract

This study proposes a modification to the objective function of the support vector machine (SVM) for the linearly non-separable case of a binary classifier with labels y_i ∈ {-1, 1}. The modification takes into account the position of each data point x_i relative to the centroid of its class. The resulting optimization function involves the centroid mean vector and the spread of the data, in addition to the support vectors, and is minimized through the choice of the hyperplane β. Theoretical assumptions were tested to derive an optimal separating hyperplane that yields the minimal misclassification rate. The proposed method was evaluated using simulation studies and real-life COVID-19 patient hospitalization outcome data. The results show that the proposed method performs better than the classical linear SVM classifier as the sample size increases, and is preferred in the presence of correlations among predictors as well as in the presence of extreme values.
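
The following is a minimal Python sketch of the kind of objective the abstract describes: a linear soft-margin hinge-loss SVM augmented with a penalty on the spread of each data point about its class centroid, projected onto the hyperplane. The function name, the weight gamma, and the specific quadratic form of the centroid term are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def centroid_svm_objective(beta, b, X, y, C=1.0, gamma=0.1):
    """Hinge-loss SVM objective with an added centroid-spread penalty.

    beta, b : hyperplane coefficients (length p) and intercept
    X, y    : data matrix (n x p) and labels in {-1, +1}
    C       : usual soft-margin trade-off
    gamma   : weight of the centroid term (an assumed form, for illustration)
    """
    # Class centroids (mean vectors), one row per observation
    mu_pos = X[y == 1].mean(axis=0)
    mu_neg = X[y == -1].mean(axis=0)
    mu = np.where((y == 1)[:, None], mu_pos, mu_neg)

    # Standard soft-margin hinge loss on the functional margins
    margins = y * (X @ beta + b)
    hinge = np.maximum(0.0, 1.0 - margins).sum()

    # Spread of each point about its class centroid, projected on beta
    spread = ((X - mu) @ beta) ** 2

    return 0.5 * beta @ beta + C * hinge + gamma * spread.sum()

# Toy usage: evaluate the objective on synthetic two-class data
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
print(centroid_svm_objective(np.array([1.0, 1.0]), 0.0, X, y))
```

Any solver that minimizes this objective over (beta, b), for example subgradient descent, would then trade off the margin, the slack, and the within-class spread along beta.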

Acknowledgement

The authors would like to acknowledge the Department of Statistics, Sultan Qaboos University, for providing a conducive research environment. Further appreciation goes to The Royal Hospital, Sultanate of Oman, for making available the data used to validate our novel SVM classifier.
