A Study on a Statistical Matching Method Using Clustering for Data Enrichment
 Authors
Kim Soon Y.; Lee Ki H.; Chung Sung S.;
 Abstract
Data fusion is the process of combining data and information from different sources so that the useful information they contain can be used more effectively. In this paper, we propose a data fusion algorithm that uses the k-means clustering method for data enrichment, with the aim of improving data quality in the knowledge discovery in databases (KDD) process. An empirical study comparing the proposed technique with existing techniques shows that the proposed clustering-based data fusion technique achieves a low MSE for continuous fusion variables.
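
The abstract does not spell out the steps of the algorithm, so the following is only a minimal sketch of one way a k-means based fusion could work, not the authors' method. It assumes a donor file that holds both the common variables and a continuous fusion variable, and a recipient file that holds only the common variables. The donor records are clustered on the common variables, each recipient record is assigned to its nearest centroid, and the fusion variable is imputed as the donor mean within that cluster. The function names `kmeans` and `fuse`, the choice of k = 3, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n, k).
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = centroids.copy()
        for j in range(k):
            if np.any(labels == j):  # keep the old centroid if a cluster empties
                new[j] = X[labels == j].mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    # Recompute labels so they always match the returned centroids.
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dist.argmin(axis=1)


def fuse(donor_x, donor_y, recipient_x, k=3):
    """Impute the fusion variable for recipients from donor cluster means."""
    centroids, labels = kmeans(donor_x, k)
    overall = donor_y.mean()  # fallback for an empty cluster
    cluster_mean = np.array([
        donor_y[labels == j].mean() if np.any(labels == j) else overall
        for j in range(k)
    ])
    dist = np.linalg.norm(recipient_x[:, None, :] - centroids[None, :, :], axis=2)
    return cluster_mean[dist.argmin(axis=1)]


def make_file(rng, n_per_group):
    """Synthetic data (assumed): three groups of common variables plus a continuous y."""
    centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
    x = np.vstack([rng.normal(c, 1.0, size=(n_per_group, 2)) for c in centers])
    y = 2.0 * x[:, 0] + x[:, 1] + rng.normal(0.0, 0.5, size=len(x))
    return x, y


rng = np.random.default_rng(42)
donor_x, donor_y = make_file(rng, 100)
recipient_x, recipient_y_true = make_file(rng, 50)  # truth kept only for evaluation

fused = fuse(donor_x, donor_y, recipient_x, k=3)
mse_fused = np.mean((fused - recipient_y_true) ** 2)
# Baseline: impute the overall donor mean for every recipient.
mse_baseline = np.mean((donor_y.mean() - recipient_y_true) ** 2)
print(f"MSE, cluster-mean fusion: {mse_fused:.3f}")
print(f"MSE, overall-mean baseline: {mse_baseline:.3f}")
```

The MSE here is computed against held-out true values of the fusion variable, which is possible only in a simulation of this kind. The overall-mean baseline is a stand-in for comparison and is not one of the existing techniques evaluated in the paper.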
 Keywords
Clustering; Data enrichment; Data fusion; Data mining; k-Nearest Neighbor; Statistical matching;
 Language
Korean
 Cited by
1.
A Study on Data Fusion Using Decision Rules, Kim Soon Y.; Chung Sung S.;

The Korean Journal of Applied Statistics (응용통계연구), 2006, vol. 19, no. 2, pp. 291-303
 References
1.
Chung, S. S., Kim, S. Y. and Kim, H. J. (2004). A study on data fusion techniques for data enrichment, The Korean Journal of Applied Statistics (응용통계연구), 17, 605-617

2.
Blake, C. L. and Merz, C. J. (1998). UCI Repository of machine learning databases [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, Department of Information and Computer Science

3.
Ingram, D., O'Hare, J., Scheuren, F. and Turek, J. (2000). Statistical matching: a new validation case study. Proceedings of the Survey Research Methods Section, American Statistical Association

4.
Rässler, S. (2002). Statistical Matching: A frequentist theory, practical applications, and alternative Bayesian approaches. New York: Springer-Verlag

5.
Saporta, G. (2002). Data fusion and data grafting, Computational Statistics & Data Analysis, 38, 465-473

6.
U.S. Department of Commerce (1980). Report on exact and statistical matching techniques. Statistical Policy Working Paper 5. Washington, DC: Federal Committee on Statistical Methodology

7.
van der Putten, P., Kok, J. N. and Gupta, A. (2002). Why the Information Explosion Can Be Bad for Data Mining, and How Data Fusion Provides a Way Out, Second SIAM International Conference on Data Mining, Arlington, April 11-13

8.
Yoshizoe, Y. and Araki, M. (1999). Use of statistical matching for household surveys in Japan. In 52nd Session of the International Statistical Institute, Helsinki, Finland