Robust Response Transformation Using Outlier Detection in Regression Model

  • Received: 2011.10
  • Reviewed: 2011.11
  • Published: 2012.02.29

Abstract

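When fitting data to a linear regression model, a transformation of the response variable is commonly attempted, but it is well known that the choice of an appropriate transformation function is sensitive to a small number of outliers. Transformation methods that are not affected by outliers have therefore been studied and developed. However, these methods, such as the trimmed likelihood estimation method based on the least trimmed squares estimator recently proposed by Cheng (2005), have drawbacks: the number of outliers must be specified in advance, or a large amount of computation is required. This paper proposes a new method that addresses these problems and improves the robustness of the estimator. For outlier detection under the response transformation, the proposed method adapts the stepwise procedure of Hadi and Simonoff (1993).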
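
As background for the problem the abstract describes, the following is a minimal sketch of the classical, non-robust Box-Cox estimate (Box and Cox, 1964) for a regression model. It is not the method proposed in this paper. The function names `boxcox_profile_loglik` and `estimate_lambda` and the grid of candidate values are assumptions made for illustration. Because the estimate maximizes a criterion built from the residual sum of squares over all observations, a few outlying responses can shift it, which is the sensitivity that motivates robust alternatives.

```python
import numpy as np

def boxcox_profile_loglik(y, X, lam):
    """Profile log-likelihood (up to a constant) of the Box-Cox parameter lam.

    y must be strictly positive; X is the design matrix including an intercept.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    log_gm = np.mean(np.log(y))  # log of the geometric mean of y
    # Normalized transformation, so residual sums of squares are comparable
    # across different values of lam.
    if abs(lam) < 1e-12:
        z = np.exp(log_gm) * np.log(y)
    else:
        z = (y**lam - 1.0) / (lam * np.exp((lam - 1.0) * log_gm))
    # Ordinary least squares fit of the transformed response.
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    rss = np.sum((z - X @ beta) ** 2)
    return -0.5 * n * np.log(rss / n)

def estimate_lambda(y, X, grid=np.linspace(-2.0, 2.0, 81)):
    """Grid-search maximizer of the profile log-likelihood (illustrative only)."""
    loglik = [boxcox_profile_loglik(y, X, lam) for lam in grid]
    return float(grid[int(np.argmax(loglik))])
```

A robust variant would, roughly speaking, evaluate the same criterion on a subset of observations judged to be free of outliers. How that subset is chosen is the subject of the methods cited in the abstract and is not reproduced here.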

Keywords

Box-Cox transformation; variable transformation; outlier; least trimmed squares estimator; regression model

References

  1. Atkinson, A. C. (1985). Plots, Transformations and Regression: An Introduction to Graphical Methods of Diagnostic Regression Analysis, Oxford University Press, Oxford.
  2. Atkinson, A. C. (1986). Aspects of diagnostic regression analysis (discussion of influential observations, high leverage points, and outliers in linear regression), Statistical Science, 1, 397-402. https://doi.org/10.1214/ss/1177013624
  3. Atkinson, A. C. (1988). Transformations unmasked, Technometrics, 30, 311-318. https://doi.org/10.2307/1270085
  4. Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations (with discussion), Journal of the Royal Statistical Society. Series B (Methodological), 26, 211-246.
  5. Cheng, T.-C. (2005). Robust regression diagnostics with data transformations, Computational Statistics & Data Analysis, 49, 875-891. https://doi.org/10.1016/j.csda.2004.06.010
  6. Cook, R. D. and Wang, P. C. (1983). Transformations and influential cases in regression, Technometrics, 25, 337-343. https://doi.org/10.2307/1267855
  7. Gentleman, J. F. and Wilk, M. B. (1975). Detecting outliers. II. Supplementing the direct analysis of residuals, Biometrics, 31, 387-410. https://doi.org/10.2307/2529428
  8. Hadi, A. S. and Luceno, A. (1997). Maximum trimmed likelihood estimators: A unified approach, examples, and algorithms, Computational Statistics & Data Analysis, 25, 251-272. https://doi.org/10.1016/S0167-9473(97)00011-X
  9. Hadi, A. S. and Simonoff, J. S. (1993). Procedures for the identification of multiple outliers in linear models, Journal of the American Statistical Association, 88, 1264-1272. https://doi.org/10.2307/2291266
  10. Hinkley, D. V. and Wang, S. (1988). More about transformations and influential cases in regression, Technometrics, 30, 435-440. https://doi.org/10.2307/1269807
  11. Kianifard, F. and Swallow, W. H. (1989). Using recursive residuals, calculated on adaptively-ordered observations, to identify outliers in linear regression, Biometrics, 45, 571-585. https://doi.org/10.2307/2531498
  12. Marasinghe, M. G. (1985). A multistage procedure for detecting several outliers in linear regression, Technometrics, 27, 395-399. https://doi.org/10.2307/1270206
  13. Paul, S. R. and Fung, K. Y. (1991). A generalized extreme studentized residual multiple-outlier-detection procedure in linear regression, Technometrics, 33, 339-348. https://doi.org/10.2307/1268785
  14. Rousseeuw, P. J. (1984). Least median of squares regression, Journal of the American Statistical Association, 79, 871-880. https://doi.org/10.2307/2288718
  15. Rousseeuw, P. J. and Van Driessen, K. (2006). Computing LTS regression for large data sets, Data Mining and Knowledge Discovery, 12, 29-45. https://doi.org/10.1007/s10618-005-0024-4
  16. Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression and Outlier Detection, John Wiley, New York.
  17. Tsai, C. L. and Wu, X. (1990). Diagnostics in transformation and weighted regression, Technometrics, 32, 315-322. https://doi.org/10.2307/1269108