A Study on Shape Variability in Canonical Correlation Biplot with Missing Values

Title & Authors
A Study on Shape Variability in Canonical Correlation Biplot with Missing Values
Hong, Hyun-Uk; Choi, Yong-Seok; Shin, Sang-Min; Ka, Chang-Wan

Abstract
The canonical correlation biplot is a graphical display of a data matrix that describes the association between two sets of variables. It is useful for detecting patterns and for displaying results found by more formal methods of analysis. However, most biplots are not directly applicable when some values in the data are missing. To address this problem, we estimate the missing values with median imputation, mean imputation, the EM algorithm, and MCMC imputation under different missing rates. Even after the missing values are estimated, the resulting biplots differ in shape depending on the imputation method and the missing rate. We therefore use the RMS (root mean square) measure proposed by Shin et al. (2007) and the PS (Procrustes statistic) to measure and compare the shape variability between the original biplot and the estimated biplots.
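The comparison described in the abstract can be illustrated with a minimal sketch. It assumes the two-dimensional biplot coordinates are already available and covers only mean and median imputation and a Procrustes statistic. The function names (`impute`, `procrustes_statistic`) and the sample coordinates are hypothetical. The statistic is taken here to be the squared full Procrustes distance after removing translation, scale, and rotation, which may differ from the exact definition used in the paper. The RMS measure is not defined in the abstract and is not sketched.

```python
import math
from statistics import mean, median


def impute(column, how="mean"):
    """Fill None entries of one variable with the mean or median of its observed values."""
    observed = [x for x in column if x is not None]
    fill = mean(observed) if how == "mean" else median(observed)
    return [fill if x is None else x for x in column]


def _preshape(points):
    """Center 2-D points and scale them to unit size, stored as complex numbers."""
    z = [complex(x, y) for x, y in points]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]
    size = math.sqrt(sum(abs(p) ** 2 for p in z))
    return [p / size for p in z]


def procrustes_statistic(a, b):
    """Squared full Procrustes distance between two 2-D configurations.

    Assumed definition: 1 - |<za, zb>|^2 for centered, unit-size
    configurations; a value near 0 means the two shapes agree up to
    translation, scale, and rotation.
    """
    za, zb = _preshape(a), _preshape(b)
    inner = sum(p.conjugate() * q for p, q in zip(za, zb))
    return 1.0 - abs(inner) ** 2


# Hypothetical biplot coordinates: a square, and the same square
# rotated by 90 degrees, doubled in size, and shifted.
original = [(0, 0), (1, 0), (1, 1), (0, 1)]
transformed = [(5, -1), (5, 1), (3, 1), (3, -1)]
# A version with one point displaced, as imputation error might cause.
distorted = [(0, 0), (1, 0), (1.4, 1.2), (0, 1)]

ps_same = procrustes_statistic(original, transformed)
ps_diff = procrustes_statistic(original, distorted)
```

Under these assumptions, `ps_same` is zero up to rounding, while `ps_diff` is positive, so a larger value indicates a biplot whose shape has moved further from the original.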
Keywords
Canonical correlation biplot; shape variability; Procrustes; missing mechanism; imputation methods
Language
Korean
Cited by
1.
Yeom, A. R. and Choi, Y. S. (2011). Partial canonical correlation biplot, The Korean Journal of Applied Statistics, 24(3), 559-566.

2.
Choi, T. H. and Choi, Y. S. (2012). A study of the physical fitness factors and basic skill factors of male tennis players using a partial canonical correlation biplot with covariate effects removed and Procrustes analysis, Communications for Statistical Applications and Methods, 19(1), 97-105.
References
1.
Choi, Y. S. (2006). Biplot Analysis, Research Institute for Basic Science, Pusan National University, Pusan National University Press, 83-86.

2.
Choi, Y. S. and Hyun, G. H. (2006). Understanding and Applications of Statistical Shape Analysis, Jayu Academy, Seoul.

3.
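Choi, T. H. and Choi, Y. S. (2008). Analysis of the association between skill factors and game performance factors of KLPGA players using canonical correlation biplots and cluster analysis, The Korean Journal of Applied Statistics, 21, 429-439.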

4.
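Choi, T. H., Choi, Y. S. and Shin, S. M. (2009). A study of player characteristic factors and match factors in tennis Grand Slam tournaments: an application of canonical correlation biplots and Procrustes analysis, The Korean Journal of Applied Statistics, 22, 855-864.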

5.
Choi, Y. S. (1991). Resistant Principal Component Analysis, Biplot and Corresponding Analysis, Ph.D. dissertation, Korea University, Seoul.

6.
Gabriel, K. R. (1971). The biplot graphic display of matrices with application to principal component analysis, Biometrika, 58, 453-467.

7.
Kim, J. G., Choi, Y. S. and Lee, N. Y. (2010a). Unbalanced ANOVA for testing shape variability in statistical shape analysis, The Korean Journal of Applied Statistics, 23, 317-323.

8.
Kim, J. G., Choi, Y. S. and Shin, S. M. (2010b). Shape variability and classification using PS, MPS and RMS in statistical shape analysis, Far East Journal of Applied Mathematics, 42, 49-60.

9.
Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data, Wiley, New York.

10.
Park, M. R. and Huh, M. H. (1996). Canonical correlation biplot, Journal of the Korea Statistical Society, 3, 11-19.

11.
Rubin, D. (1987). Multiple Imputation for Nonresponse in Surveys, Wiley & Sons, New York.

12.
SAS Institute Inc. (1990). SAS/STAT User's Guide, Vol. 1, 4th ed., SAS Institute Inc., Cary, NC.

13.
Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data, Chapman & Hall, London.

14.
Shin, S. M., Choi, Y. S. and Lee, N. Y. (2008). Comparison of shape variability in principal component biplot with missing values, The Korean Journal of Applied Statistics, 21, 1109-1116.

15.
Tanner, M. A. and Wong, W. H. (1987). The calculation of posterior distributions by data augmentation, Journal of the American Statistical Association, 82, 528-540.