Variable Arrangement for Data Visualization

  • Huh, Moon Yul (Department of Statistics, Sungkyunkwan University, Seoul, Korea)
  • Song, Kwang Ryeol (Department of Statistics, Sungkyunkwan University)
  • Published : 2001.12.01

Abstract

Classical plots such as scatterplot matrices and parallel coordinates are valuable tools for data visualization. They are used extensively in modern data mining software to explore the inherent structure of data and, from there, to visually classify or cluster a database into appropriate groups. However, the interpretation of these plots is very sensitive to the arrangement of the variables. In this work, we introduce two methods for arranging variables for data visualization. The first method, based on the work of Wegman (1999), arranges the variables using the minimum distance among all pairwise permutations of the variables. The second method uses the idea of principal components. We investigate the effectiveness of these methods with parallel coordinates on real data sets, and show that each of the two proposed methods has its own strengths.
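
The abstract does not specify the distance measure or how the principal components determine the ordering, so the following Python sketch is only an illustration under stated assumptions, not the authors' procedure. It assumes the distance between two variables is 1 - |r| (r being their Pearson correlation), that "minimum distance" means the smallest total distance between adjacent axes, and that the principal-component ordering sorts variables by their loading on the first component. The function names are hypothetical, and every column is assumed to have nonzero variance.

```python
from itertools import permutations
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([x - mean for x in col])
    norms = [sqrt(sum(x * x for x in c)) for c in centered]
    p = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j]))
            / (norms[i] * norms[j])
            for j in range(p)
        ]
        for i in range(p)
    ]


def min_distance_order(corr):
    """Exhaustive search for the axis order with the smallest total
    distance between neighbours (assumed distance: 1 - |r|).
    Only practical for a handful of variables, since it tries p! orders."""
    p = len(corr)

    def cost(order):
        return sum(1 - abs(corr[a][b]) for a, b in zip(order, order[1:]))

    # An order and its reverse give the same plot, so keep one of each pair.
    candidates = (o for o in permutations(range(p)) if o[0] <= o[-1])
    return list(min(candidates, key=cost))


def first_pc_order(corr, iterations=500):
    """Sort variables by their loading on the first principal component
    of the correlation matrix, found here by power iteration."""
    p = len(corr)
    v = [float(i + 1) for i in range(p)]  # arbitrary non-uniform start vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return sorted(range(p), key=lambda i: v[i])
```

Either function returns a list of column indices that can be used as the left-to-right axis order of a parallel coordinates plot. The sign of a principal component is arbitrary, so the second ordering is only defined up to reversal, which leaves the plot unchanged apart from mirroring.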

References

  1. Ankerst, M.; Berchtold, S.; Keim, D. Similarity Clustering of Dimensions for an Enhanced Visualization of Multidimensional Data. Visualization, v.99.
  2. Becker, Richard A.; Cleveland, William S.; Wilks, Allan R. Dynamic Graphics for Data Analysis. Statistical Science, v.2.
  3. Dorigo, M.; Gambardella, L.M. Ant Colony System: A Cooperative Learning Approach to the Travelling Salesman Problem. IEEE Trans. on Evolutionary Computation, v.1, no.1.
  4. Fisher, R.A. The Use of Multiple Measurements in Taxonomic Problems. Annals of Eugenics, v.7.
  5. Kaufman, L.; Rousseeuw, P. Finding Groups in Data.
  6. kensington
  7. Tierney, Luke. LISP-STAT.
  8. UCI
  9. Wegman, Edward J. Hyperdimensional Data Analysis Using Parallel Coordinates. Journal of the American Statistical Association, v.85, no.411.
  10. Wegman, Edward J.; Luo, Qiang. High Dimensional Clustering Using Parallel Coordinates and the Grand Tour. Computing Science and Statistics: Proceedings of the 28th Symposium on the Interface; Lynne Billard (ed.); Nicholas Fisher (ed.).
  11. Wegman, Edward J. Data Mining and Visualization. Bulletin of the International Statistical Institute, ISI 99, Proceedings Book 3.