Principal Component Analysis with Coefficient of Variation Matrix

변동계수행렬을 이용한 주성분분석

  • Kim, Ji-Hyun (Department of Statistics and Actuarial Science, Soongsil University)
  • 김지현 (숭실대학교 정보통계보험수리학과)
  • Received : 2015.01.05
  • Accepted : 2015.03.21
  • Published : 2015.06.30

Abstract

Principal component analysis (PCA), a dimension-reduction technique, is usually applied after the variables have been standardized when their measurement units differ or when their variances are severely unbalanced. To standardize a variable, we divide it by its standard deviation. There is, however, another way to make a variable independent of its measurement unit: dividing it by its mean rather than its standard deviation. Applying PCA to standardized variables is equivalent to applying PCA to the correlation matrix of the original variables. Similarly, we show that applying PCA to variables divided by their means is equivalent to applying PCA to a matrix related to the coefficients of variation of the original variables. We explain why PCA on the mean-divided variables is needed.
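
The two equivalences stated in the abstract can be checked numerically. The following is a minimal sketch in Python with NumPy, not the author's implementation. The simulated data, the variable names, and the helper `principal_components` are hypothetical. The abstract does not define the matrix related to the coefficients of variation, so the sketch assumes it has entries s_ij / (m_i m_j), where s_ij is the sample covariance and m_i the sample mean. Under that assumption the diagonal holds the squared coefficients of variation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical positive-valued data: 200 observations of 3 variables
# measured on very different scales (illustration only).
X = rng.gamma(shape=[2.0, 5.0, 9.0], scale=[1.0, 10.0, 100.0], size=(200, 3))

means = X.mean(axis=0)
sds = X.std(axis=0, ddof=1)      # sample standard deviations
cv = sds / means                 # coefficients of variation

S = np.cov(X, rowvar=False)      # sample covariance matrix
R = np.corrcoef(X, rowvar=False) # sample correlation matrix

# Dividing by the standard deviation: the covariance matrix of the
# transformed variables is the correlation matrix of the originals.
cov_sd_scaled = np.cov(X / sds, rowvar=False)

# Dividing by the mean: the covariance matrix of the transformed
# variables has entries s_ij / (m_i * m_j) (assumed form, see lead-in).
cov_mean_scaled = np.cov(X / means, rowvar=False)
cv_matrix = S / np.outer(means, means)


def principal_components(matrix):
    """Eigen-decompose a symmetric matrix, largest eigenvalue first."""
    eigvals, eigvecs = np.linalg.eigh(matrix)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


eig_R, _ = principal_components(R)
eig_cv, _ = principal_components(cv_matrix)

print("Proportion of variance, correlation matrix:", eig_R / eig_R.sum())
print("Proportion of variance, CV-based matrix:  ", eig_cv / eig_cv.sum())
```

In this sketch the CV-based matrix equals the correlation matrix with entry (i, j) rescaled by CV_i CV_j. Unlike standardization, which forces every variable to unit variance, division by the mean therefore lets variables with larger relative variability carry more weight in the components.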
