Penalized logistic regression models for determining the discharge of dyspnea patients

  • Received: 2012.12.14
  • Accepted: 2013.01.10
  • Published: 2013.01.31

Abstract

This paper applies penalized binary logistic regression to decide whether patients presenting with a chief complaint of dyspnea should be discharged, using the results of 11 blood tests from 668 patients. Two penalized models are considered: the ridge model, based on the $L^2$ penalty, and the Lasso model, based on the $L^1$ penalty. Their prediction accuracy is compared with that of two ordinary logistic regression models, one using all 11 explanatory variables and one using only the variables chosen by a variable selection method. Under 10-fold cross-validation, the ridge logistic regression model has the best prediction accuracy of the four models.
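To make the comparison described in the abstract concrete, the following is a minimal sketch of ridge and Lasso logistic regression evaluated by 10-fold cross-validation. It is not the authors' procedure or software. The data are synthetic (the patient data are not reproduced here), the function names `fit` and `cv_accuracy` are hypothetical, the penalty values are arbitrary, and the optimizer is a plain (proximal) gradient descent chosen only because it needs nothing beyond the Python standard library.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    # Proximal operator of t * |v|; this is what produces exact zeros under the L1 penalty.
    return math.copysign(max(abs(v) - t, 0.0), v)

def fit(X, y, lam, penalty, steps=200, lr=0.1):
    """Minimize mean negative log-likelihood + lam * P(beta), where
    P = sum(beta_j^2) for "ridge" and P = sum(|beta_j|) for "lasso".
    The intercept b0 is not penalized (an assumption of this sketch)."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(steps):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            r = sigmoid(b0 + sum(b * x for b, x in zip(beta, xi))) - yi
            g0 += r / n
            for j in range(p):
                g[j] += r * xi[j] / n
        b0 -= lr * g0
        if penalty == "ridge":
            # Gradient of lam * beta_j^2 is 2 * lam * beta_j.
            beta = [b - lr * (gj + 2.0 * lam * b) for b, gj in zip(beta, g)]
        else:
            # Lasso: gradient step on the likelihood, then soft-thresholding.
            beta = [soft_threshold(b - lr * gj, lr * lam) for b, gj in zip(beta, g)]
    return b0, beta

def cv_accuracy(X, y, lam, penalty, k=10, seed=0):
    # k-fold cross-validated classification accuracy with a 0.5 probability cutoff.
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        Xtr = [X[i] for i in idx if i not in held]
        ytr = [y[i] for i in idx if i not in held]
        b0, beta = fit(Xtr, ytr, lam, penalty)
        for i in fold:
            p_hat = sigmoid(b0 + sum(b * x for b, x in zip(beta, X[i])))
            correct += int((p_hat >= 0.5) == (y[i] == 1))
    return correct / len(X)

# Synthetic stand-in data: 200 cases, 11 standardized predictors, 3 of them informative.
rng = random.Random(1)
true_beta = [2.0, -2.0, 1.5] + [0.0] * 8
X = [[rng.gauss(0, 1) for _ in range(11)] for _ in range(200)]
y = [1 if rng.random() < sigmoid(sum(b * x for b, x in zip(true_beta, xi))) else 0
     for xi in X]

acc_ridge = cv_accuracy(X, y, lam=0.01, penalty="ridge")
acc_lasso = cv_accuracy(X, y, lam=0.01, penalty="lasso")
print("ridge 10-fold accuracy:", acc_ridge)
print("lasso 10-fold accuracy:", acc_lasso)
```

In practice the penalty weight would itself be tuned, typically by an inner cross-validation, and the predictors would be standardized before fitting so that the penalty treats them on a common scale.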

