Comparison of Shape Variability in Principal Component Biplot with Missing Values

Shin, Sang-Min;Choi, Yong-Seok;Lee, Nae-Young

  • Published : 2008.12.31

Abstract

Biplots are the multivariate analogue of scatter plots. They are useful for giving a graphical description of a data matrix, for detecting patterns, and for displaying results found by more formal methods of analysis. However, when some values in the data matrix are missing, most biplots are not directly applicable. We are particularly interested in the shape variability, in the presence of missing values, of the principal component biplot, the most popular type of biplot. To study it, we estimate the missing data using the EM algorithm and mean imputation at different missing rates. Even after the missing values are estimated, the resulting biplots differ in shape depending on the imputation method and the missing rate. We therefore propose an RMS (root mean square) measure for quantifying and comparing the shape variability between the original biplots and the estimated biplots.
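The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact RMS formula, so the version below (root mean square of row-marker distances after an orthogonal Procrustes rotation) is an assumption, as are the function names, the synthetic data, and the 10% missing rate. Only mean imputation is shown; the EM-based estimate is omitted.

```python
import numpy as np


def mean_impute(X):
    """Replace each NaN with the mean of the observed values in its column."""
    X = np.asarray(X, dtype=float)
    col_means = np.nanmean(X, axis=0)
    return np.where(np.isnan(X), col_means, X)


def pc_biplot_rows(X, k=2):
    """Row markers of a rank-k principal component biplot via the SVD."""
    Xc = X - X.mean(axis=0)                      # column-centre the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]                      # principal component scores


def rms_after_rotation(A, B):
    """RMS distance between configurations A and B after rotating B onto A.

    Assumed definition (not taken from the paper): the orthogonal matrix R
    minimising ||A - B R||_F removes the sign/rotation ambiguity of the
    principal axes before distances are measured.
    """
    U, _, Vt = np.linalg.svd(B.T @ A)
    R = U @ Vt
    return np.sqrt(np.mean(np.sum((A - B @ R) ** 2, axis=1)))


# Hypothetical example: synthetic complete data with 10% of entries removed.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4)) @ rng.normal(size=(4, 4))
mask = rng.random(X.shape) < 0.10
X_miss = np.where(mask, np.nan, X)

G_full = pc_biplot_rows(X)                       # biplot of the original data
G_imp = pc_biplot_rows(mean_impute(X_miss))      # biplot after mean imputation
print("RMS shape difference:", rms_after_rotation(G_full, G_imp))
```

Repeating the last step over several missing rates, and over a second imputation method, would give the kind of side-by-side comparison the abstract describes.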

Keywords

Biplots;EM algorithm;mean imputation;principal component biplot;RMS;shape variability

Cited by

  1. A Study on Shape Variability in Canonical Correlation Biplot with Missing Values vol.23, pp.5, 2010, https://doi.org/10.5351/KJAS.2010.23.5.955
  2. A robust AMMI model for the analysis of genotype-by-environment data 2015, https://doi.org/10.1093/bioinformatics/btv533