Estimating the Number of Clusters using Hotelling's T²
Choi, Kyung-Mee
In cluster analysis, Hotelling's T² can be used to estimate the unknown number of clusters, based on the idea of a multiple comparison procedure. In particular, its threshold is obtained from the probability of committing a type one error. Examples are used to compare Hotelling's T² with other classical location test statistics such as the Sum-of-Squared Error and Wilks' Λ. Hierarchical clustering is used to reveal the underlying structure of the data. Related criteria are also reviewed in terms of both the between variance and the within variance.
Multiple Comparison Procedure; Type One Error; Bonferroni-Type Significance Level
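
The idea summarized in the abstract can be illustrated with a minimal sketch, which is not the paper's own procedure. It computes the standard two-sample Hotelling's T² with a pooled covariance matrix, converts it to an F statistic, and compares the p-value with a Bonferroni-type level alpha / m, where m is the number of pairwise comparisons. The function names `hotelling_t2` and `should_split` and the parameters `alpha` and `n_comparisons` are hypothetical, and the sketch assumes roughly multivariate normal clusters with a common covariance matrix.

```python
import numpy as np
from scipy.stats import f as f_dist


def hotelling_t2(x, y):
    """Two-sample Hotelling's T^2 with pooled covariance.

    x, y: arrays of shape (n1, p) and (n2, p), one row per observation.
    Returns (t2, f_stat, p_value).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, p = x.shape
    n2 = y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled within-cluster covariance (unbiased estimates, weighted by d.o.f.).
    s_pooled = ((n1 - 1) * np.cov(x, rowvar=False)
                + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    s_pooled = np.atleast_2d(s_pooled)
    t2 = (n1 * n2) / (n1 + n2) * (diff @ np.linalg.solve(s_pooled, diff))
    # Under the null hypothesis of equal means, the scaled statistic
    # follows an F distribution with (p, n1 + n2 - p - 1) degrees of freedom.
    dof2 = n1 + n2 - p - 1
    f_stat = t2 * dof2 / (p * (n1 + n2 - 2))
    p_value = f_dist.sf(f_stat, p, dof2)
    return t2, f_stat, p_value


def should_split(x, y, alpha=0.05, n_comparisons=1):
    """Keep two clusters separate if the equal-means hypothesis is rejected
    at the Bonferroni-type level alpha / n_comparisons (assumed rule)."""
    _, _, p_value = hotelling_t2(x, y)
    return p_value < alpha / n_comparisons


if __name__ == "__main__":
    a = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    b = a + 10.0  # same shape, clearly shifted location
    print(hotelling_t2(a, b))
    print(should_split(a, b, alpha=0.05, n_comparisons=3))
```

In a hierarchical clustering setting, one possible use is to apply `should_split` to the pair of clusters merged at each level and stop merging once the test rejects. Whether this matches the stopping rule used in the paper is an assumption.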