Document Clustering Methods using Hierarchy of Document Contents

A Document Comparison Method Using Hierarchical Structuring of Document Contents

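  • 황명권 (School of Computer Engineering, Chosun University) ;
  • 배용근 (School of Computer Engineering, Chosun University) ;
  • 김판구 (School of Computer Engineering, Chosun University)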
  • Published : 2006.12.30


The web continues to accumulate vast amounts of information, and text documents are among the forms people use most easily and most often. Consequently, much research has addressed text document retrieval using techniques such as probabilistic and statistical models, vector similarity, and Bayesian methods. These approaches, however, do not consider both the subject and the semantics of a document. To overcome this limitation, we propose a document similarity method for semantically retrieving the documents that users want; this measure is also the core of document clustering. The method first represents the content of a document as a semantic hierarchy and assigns weights to the important domains in that hierarchy. Similarity between documents is then measured using both the domain weights and the coincidence of concepts within the domain hierarchies.
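
The idea of combining domain weights with concept coincidence can be illustrated with a minimal Python sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the `DocHierarchy` representation, the `document_similarity` function, the use of Jaccard overlap for concept coincidence, and the use of the smaller of the two domain weights. The sketch also assumes that the domain weights of each document sum to 1, so that the score stays in [0, 1].

```python
from typing import Dict, Set, Tuple

# Hypothetical representation (not from the paper):
# domain name -> (domain weight, set of concepts found under that domain).
DocHierarchy = Dict[str, Tuple[float, Set[str]]]


def document_similarity(doc_a: DocHierarchy, doc_b: DocHierarchy) -> float:
    """Illustrative similarity score; the scoring rule is an assumption."""
    score = 0.0
    # Only domains present in both documents can contribute.
    for domain in doc_a.keys() & doc_b.keys():
        weight_a, concepts_a = doc_a[domain]
        weight_b, concepts_b = doc_b[domain]
        union = concepts_a | concepts_b
        if not union:
            continue  # no concepts to compare in this domain
        # Concept coincidence, modeled here as Jaccard overlap.
        overlap = len(concepts_a & concepts_b) / len(union)
        # Weight the overlap by how important the domain is to both documents.
        score += min(weight_a, weight_b) * overlap
    return score


# Made-up example documents.
doc_a: DocHierarchy = {
    "computing": (0.75, {"retrieval", "clustering", "index"}),
    "biology": (0.25, {"cell"}),
}
doc_b: DocHierarchy = {
    "computing": (0.5, {"retrieval", "clustering"}),
    "finance": (0.5, {"bond"}),
}
similarity = document_similarity(doc_a, doc_b)
```

In this sketch, two documents score highly only when they share heavily weighted domains and also share concepts within those domains, which mirrors the abstract's point that subject and semantics should be considered together.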


