Recovery Levels of Clustering Algorithms Using Different Similarity Measures for Functional Data
Chae, Seong San; Kim, Chansoo; Warde, William D.
Clustering algorithms with different similarity measures are commonly used to find an optimal clustering or one close to the original clustering. The recovery levels obtained with Euclidean distance and with distances transformed from correlation coefficients are evaluated and compared using Rand's (1971) C statistic. The C value indicates how close the resulting clustering is to the original clustering. In a simulation study, the recovery level improved when correlation coefficients between objects were used. Recovery levels under the different similarity measures are also presented for the data set of Spellman et al. (1998). In general, using the correlation coefficients increased the recovery level of the true clusters.
Agglomerative clustering algorithms; Correlation coefficients; Rand's C statistic
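The two quantities named in the abstract can be illustrated with a minimal sketch. Rand's C is computed here as the fraction of object pairs on which two clusterings agree (both place the pair together, or both place it apart). The abstract does not state which transformation turns a correlation coefficient into a distance, so `correlation_distance` assumes d = 1 - r with r the Pearson correlation; the function names are likewise hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def rand_c(labels_a, labels_b):
    """Rand's C: fraction of object pairs on which two clusterings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two objects")
    # A pair counts as an agreement when both clusterings put the two
    # objects together, or both put them in different clusters.
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def correlation_distance(x, y):
    """Assumed transform d = 1 - r (r = Pearson correlation of two curves)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # Undefined (division by zero) if either curve is constant.
    return 1.0 - sxy / sqrt(sxx * syy)
```

Under this assumed transform, two curves with the same shape but different scale have distance 0 even when their Euclidean distance is large, which is one plausible reason a correlation-based measure could recover shape-defined clusters better.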
Journal of the Korean Statistical Society, 1991.
Proceedings of the National Academy of Sciences of the USA, 2003.
The Canadian Journal of Statistics, 1979.
ASA Proceedings of the Social Statistics Section, 1981.
Communications in Statistics: Theory and Methods, 1987.
Statistics & Probability Letters, 2004.
Proceedings of the National Academy of Sciences of the USA, 1998.
Journal of the American Statistical Association, 1983.
Journal of Biological Chemistry, 2002.