A polychotomous regression model with tensor product splines and direct sums

A multinomial logistic regression model using tensor products for continuous variables and direct sums for categorical variables

  • Sim, Songyong (심송용), Department of Finance & Information Statistics, Hallym University
  • Kang, Heemo (강희모), Department of Finance & Information Statistics, Hallym University
  • Received : 2013.09.30
  • Accepted : 2013.11.11
  • Published : 2014.01.31

Abstract

In this paper, we propose a polychotomous regression model for the case where the independent variables include both categorical and numerical variables. We use direct sums for the categorical independent variables and tensor product splines for the continuous ones. BIC is used as the variable selection criterion. We implemented the algorithm and applied it to real data, where the use of direct sums and tensor products achieved a better classification rate than the usual multinomial logistic regression model.

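When the explanatory variables of a multinomial logistic regression model include both continuous and categorical variables, we propose a model that applies direct sums to the categorical explanatory variables and tensor products to the continuous ones. BIC is used as the variable selection criterion, and an algorithm for the proposed model was implemented. Applying the implemented algorithm to real data and comparing it with the existing method, we confirmed that the proposed model gives a better classification rate.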
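To make the construction in the abstract concrete, the following is a minimal sketch of how such a design might be assembled. It is not the authors' implementation. Several choices here are assumptions made only for illustration: the truncated-linear spline basis, the knot locations, reading "direct sum" as concatenated indicator blocks with a reference level dropped, and every function name (`linear_spline_basis`, `tensor_product`, `indicator_block`, `design_row`, `bic`). The BIC formula itself is the standard one, $-2\ell + p \log n$.

```python
import math
from itertools import product


def linear_spline_basis(x, knots):
    """Truncated-linear basis [x, (x - k1)+, (x - k2)+, ...] (assumed basis)."""
    return [x] + [max(x - k, 0.0) for k in knots]


def tensor_product(basis_a, basis_b):
    """All pairwise products of two univariate basis vectors."""
    return [a * b for a, b in product(basis_a, basis_b)]


def indicator_block(value, levels):
    """Indicator columns for one categorical variable.

    The first level is dropped as the reference (an illustrative choice).
    """
    return [1.0 if value == level else 0.0 for level in levels[1:]]


def design_row(x1, x2, cats, knots1, knots2, cat_levels):
    """One design row: intercept, spline main effects, their tensor
    product, then one indicator block per categorical variable."""
    b1 = linear_spline_basis(x1, knots1)
    b2 = linear_spline_basis(x2, knots2)
    row = [1.0] + b1 + b2 + tensor_product(b1, b2)
    for value, levels in zip(cats, cat_levels):
        row += indicator_block(value, levels)
    return row


def bic(log_likelihood, n_params, n_obs):
    """Standard BIC: -2 * log-likelihood + (number of parameters) * log(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical inputs: two continuous predictors, one 3-level factor.
row = design_row(0.5, 2.0, ["b"], knots1=[0.0], knots2=[1.0],
                 cat_levels=[["a", "b", "c"]])

# For a response with K classes, a multinomial logit fit uses
# (K - 1) coefficient vectors, each of length len(row).
n_classes = 3
n_params = (n_classes - 1) * len(row)
score = bic(log_likelihood=-100.0, n_params=n_params, n_obs=200)
```

In a stepwise search, candidate basis functions would be added or removed and the model with the smallest BIC kept. The log-likelihood value above is a placeholder, not a result from the paper.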
