Semi-supervised learning using similarity and dissimilarity

  • Seok, Kyung-Ha (Department of Data Science, Institute of Statistical Information, Inje University)
  • Received : 2010.10.22
  • Accepted : 2010.12.20
  • Published : 2011.01.31

Abstract

We propose a semi-supervised learning algorithm based on a form of regularization that incorporates both similarity and dissimilarity penalty terms, using a graph-based encoding of the similarity and dissimilarity relations. We also present a model-selection method that uses cross-validation to choose the hyperparameters governing the performance of the proposed method. Simulations on two types of data sets demonstrate that the proposed method is promising.
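
As a rough illustration of the approach described above (a sketch, not the paper's exact formulation), the following Python code fits a regularized least squares classifier in which similarity edges penalize squared differences (f_i - f_j)^2 and dissimilarity edges penalize squared sums (f_i + f_j)^2, the mixed-graph encoding of Goldberg et al. (2007) cited below; the closed-form solution follows the Laplacian-regularized least squares of Belkin et al. (2006). All function names (rbf_kernel, mixed_penalty, fit_ssl), the RBF kernel choice, and the default hyperparameter values are illustrative assumptions, not details taken from the paper.

    import numpy as np

    def rbf_kernel(A, B, sigma=1.0):
        # Gaussian kernel matrix between the rows of A and B (assumed kernel choice).
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq / (2.0 * sigma ** 2))

    def mixed_penalty(n, sim_edges, dis_edges):
        # Similarity edge (i, j, w): penalizes w * (f_i - f_j)^2, adding D - W terms.
        # Dissimilarity edge (i, j, w): penalizes w * (f_i + f_j)^2, adding D + W terms,
        # so the combined penalty matrix M stays positive semidefinite.
        M = np.zeros((n, n))
        for i, j, w in sim_edges:
            M[i, i] += w; M[j, j] += w
            M[i, j] -= w; M[j, i] -= w
        for i, j, w in dis_edges:
            M[i, i] += w; M[j, j] += w
            M[i, j] += w; M[j, i] += w
        return M

    def fit_ssl(X, y_lab, lab_idx, sim_edges, dis_edges,
                gamma_a=1e-2, gamma_i=1e-1, sigma=1.0):
        # Representer theorem: f = K @ alpha over labeled and unlabeled points.
        # Minimizing (1/l) * sum_labeled (y_i - f_i)^2 + gamma_a * ||f||_K^2
        # + gamma_i * f' M f yields the linear system solved below.
        n, l = X.shape[0], len(lab_idx)
        lab_idx = np.asarray(lab_idx)
        K = rbf_kernel(X, X, sigma)
        J = np.zeros((n, n))
        J[lab_idx, lab_idx] = 1.0            # selects the labeled rows
        y = np.zeros(n)
        y[lab_idx] = y_lab                   # labels in {-1, +1}; zeros elsewhere
        M = mixed_penalty(n, sim_edges, dis_edges)
        A = J @ K + gamma_a * l * np.eye(n) + gamma_i * l * (M @ K)
        alpha = np.linalg.solve(A, J @ y)
        return lambda X_new: np.sign(rbf_kernel(X_new, X, sigma) @ alpha)

The abstract's model-selection step can be mimicked by scoring a hyperparameter grid with k-fold cross-validation error on the labeled points only (unlabeled points carry no targets to validate against); the grid values below are again assumptions for illustration.

    from itertools import product

    def cv_select(X, y_lab, lab_idx, sim_edges, dis_edges, k=3, seed=0):
        # Grid search over (gamma_a, gamma_i, sigma) by k-fold CV on labeled points.
        # Held-out labeled points remain in X, so they act as unlabeled during each fit.
        y_lab, lab_idx = np.asarray(y_lab), np.asarray(lab_idx)
        rng = np.random.default_rng(seed)
        folds = np.array_split(rng.permutation(len(lab_idx)), k)
        best, best_err = None, np.inf
        for ga, gi, s in product([1e-3, 1e-2, 1e-1], [1e-2, 1e-1, 1.0], [0.5, 1.0, 2.0]):
            err = 0.0
            for fold in folds:
                tr = np.setdiff1d(np.arange(len(lab_idx)), fold)
                predict = fit_ssl(X, y_lab[tr], lab_idx[tr],
                                  sim_edges, dis_edges, ga, gi, s)
                err += np.mean(predict(X[lab_idx[fold]]) != y_lab[fold])
            if err < best_err:
                best, best_err = (ga, gi, s), err
        return best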

References

  1. Belkin, M., Sindhwani, V. and Niyogi, P. (2005). On manifold regularization. Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics.
  2. Belkin, M., Sindhwani, V. and Niyogi, P. (2006). Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 7, 2399-2434.
  3. Chapelle, O., Zien, A. and Scholkopf, B. (2006). Semi-supervised learning, MIT Press.
  4. Goldberg, A., Zhu, X. and Wright, S. (2007). Dissimilarity in graph-based semi-supervised classification. Proceedings of the 11th International Conference on Artificial Intelligence and Statistics.
  5. Hwang, C. (2008). Mixed effect kernel binomial regression. Journal of Korean Data and Information Science Society, 19, 1327-1334.
  6. Hwang, H. T. (2010). Fixed size LS-SVM for multiclassification problems of large data sets. Journal of Korean Data and Information Science Society, 21, 561-567.
  7. Kimeldorf, G. S. and Wahba, G. (1971). Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications, 33, 82-95. https://doi.org/10.1016/0022-247X(71)90184-3
  8. Lafferty, J. and Wasserman, L. (2007). Statistical analysis of semi-supervised regression. In NIPS.
  9. Niyogi, P. (2008). Manifold regularization and semi-supervised learning: Some theoretical analyses. Technical Report TR-2008-01, Department of Computer Science, University of Chicago.
  10. Seok, K. H. (2010). Semi-supervised classification with LS-SVM formulation. Journal of Korean Data and Information Science Society, 461-470.
  11. Shim, J., Park, H. and Hwang, C. (2009). A kernel machine for estimation of mean and volatility functions. Journal of Korean Data and Information Science Society, 20, 905-912.
  12. Sindhwani, V., Niyogi, P. and Belkin, M. (2005). Beyond the point cloud: From transductive to semi-supervised learning. In ICML05, 22nd International Conference on Machine Learning.
  13. Singh, A., Nowak, R. and Zhu, X. (2008). Unlabeled data: Now it helps, now it doesn't. In NIPS.
  14. Suykens, J. A. K. (2000). Least squares support vector machine for classification and nonlinear modeling. Neural Network World, Special Issue on PASE 2000, 10, 29-48.
  15. Vapnik, V. (1995). The nature of statistical learning theory, Springer-Verlag, New York.
  16. Zhu, X. (2005). Semi-supervised learning literature survey, Technical Report 1530, Department of Computer Sciences, University of Wisconsin, Madison.
  17. Zhu, X. and Goldberg, A. (2009). Introduction to semi-supervised learning, Morgan & Claypool.
  18. Zhu, X., Ghahramani, Z. and Lafferty, J. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. In ICML.