Smoothing parameter selection in semi-supervised learning

A study on parameter selection in semi-supervised learning

  • Received : 2016.05.16
  • Accepted : 2016.07.22
  • Published : 2016.07.31

Abstract

Semi-supervised learning makes it easy to use unlabeled data in supervised learning tasks such as classification. Applying semi-supervised learning to regression analysis, we propose two methods for better estimation of the regression function. The proposed methods allow the marginal densities of the independent variables, and the smoothing parameters, to differ between the unlabeled and labeled data. We show that an overfitted pilot estimator should be used to achieve the fastest convergence rate, and that unlabeled data may help improve the convergence rate when the smoothing parameters are well estimated. We also find the conditions on the smoothing parameters under which the optimal convergence rate is achieved.

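Semi-supervised learning, which uses data without response values in supervised learning, has received more attention in classification. This study proposes a semi-supervised regression function estimator that applies semi-supervised learning to regression analysis. The proposed method has the same form as existing methods but is more general: it assumes different marginal distributions for the data with and without response values, and it uses a different smoothing parameter for each. We derive the asymptotic distribution of the proposed estimator and find the conditions satisfied by the optimal smoothing parameters that minimize the asymptotic mean squared error. We show that unlabeled data can also help in regression analysis if the marginal distribution of the explanatory variables is estimated well, the conditions on the sizes of the data with and without response values can be suitably controlled, and the smoothing parameters are suitably chosen. We also prove that, as in semi-supervised classification, the initial estimate for the data without response values is best overfitted by using a small smoothing parameter.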
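
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the estimator proposed in the paper. It assumes a Nadaraya-Watson (local constant) smoother with a Gaussian kernel: a pilot fit with a deliberately small bandwidth assigns responses to the unlabeled points, and a second smoothing step combines labeled and pseudo-labeled data with separate bandwidths. The function names and the parameters `h_pilot`, `h_lab`, and `h_unl` are hypothetical and chosen for illustration.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def nadaraya_watson(x_eval, x_train, y_train, h):
    """Local constant regression estimate at x_eval with bandwidth h."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    weights = gaussian_kernel((x_eval[:, None] - x_train[None, :]) / h)
    denom = weights.sum(axis=1)
    # Guard against division by zero when every weight underflows.
    return (weights @ y_train) / np.maximum(denom, 1e-300)


def semi_supervised_nw(x_eval, x_lab, y_lab, x_unl, h_pilot, h_lab, h_unl):
    """Illustrative two-step semi-supervised smoother (an assumption)."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    # Step 1: pilot estimate at the unlabeled points. A small h_pilot
    # gives an undersmoothed (overfitted) pilot fit.
    y_pilot = nadaraya_watson(x_unl, x_lab, y_lab, h_pilot)
    # Step 2: smooth labeled and pseudo-labeled points together,
    # each group with its own bandwidth.
    w_lab = gaussian_kernel((x_eval[:, None] - x_lab[None, :]) / h_lab) / h_lab
    w_unl = gaussian_kernel((x_eval[:, None] - x_unl[None, :]) / h_unl) / h_unl
    num = w_lab @ y_lab + w_unl @ y_pilot
    den = w_lab.sum(axis=1) + w_unl.sum(axis=1)
    return num / np.maximum(den, 1e-300)


# Simulated example: 40 labeled and 400 unlabeled points on [0, 1].
rng = np.random.default_rng(0)
x_lab = rng.uniform(0.0, 1.0, 40)
y_lab = np.sin(2.0 * np.pi * x_lab) + 0.1 * rng.normal(size=40)
x_unl = rng.uniform(0.0, 1.0, 400)
grid = np.linspace(0.1, 0.9, 9)
fit = semi_supervised_nw(grid, x_lab, y_lab, x_unl,
                         h_pilot=0.03, h_lab=0.1, h_unl=0.1)
```

In this sketch the final estimate is a weighted average of the observed responses and the pilot values, so it always lies within the range of the labeled responses. How the three bandwidths should scale with the two sample sizes is the question the paper addresses; the values used here are arbitrary.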
