Semantic Extension Search for Documents Using Word2vec

Korean title: Word2vec을 활용한 문서의 의미 확장 검색방법 (A Semantic Extension Search Method for Documents Using Word2vec)

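  • 김우주 (Department of Information and Industrial Engineering, Yonsei University) ;
  • 김동희 (Korea Railroad Research Institute) ;
  • 장희원 (Department of Information and Industrial Engineering, Yonsei University)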
  • Received : 2016.09.27
  • Accepted : 2016.10.10
  • Published : 2016.10.28


The conventional way to search documents is to issue keyword-based queries against a vector space model such as tf-idf. Keyword-based search has a known weakness: it cannot recognize that two words are semantically the same when they are lexically different. This paper studies a document search scheme in which the query is itself a document. In particular, it represents query documents with centrality vectors instead of tf-idf vectors, and combines them with the Word2vec method to capture the semantic similarity between the words the documents contain. This scheme improves document search performance and provides a way to find documents that are close to a query document not only lexically but also semantically.
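The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which centrality measure or which similarity formula the paper uses, so the sketch assumes weighted degree centrality on a word co-occurrence graph and a soft-cosine style score. The embedding table `EMB` is a hypothetical, hand-made stand-in for trained Word2vec vectors, and the function names and the `window` parameter are illustrative only.

```python
import math

# Hypothetical toy embeddings standing in for trained Word2vec vectors.
EMB = {
    "car": [0.90, 0.10, 0.00],
    "automobile": [0.85, 0.15, 0.05],
    "engine": [0.70, 0.30, 0.10],
    "motor": [0.75, 0.25, 0.10],
    "banana": [0.00, 0.20, 0.90],
    "fruit": [0.05, 0.25, 0.85],
}


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centrality_vector(tokens, window=2):
    """Assumed centrality: degree of each word in a co-occurrence graph.

    Words that appear within `window` positions of each other are linked.
    The result is normalized to sum to 1. Words without an embedding are
    dropped, and a document with fewer than two distinct known words
    yields an empty vector.
    """
    known = [t for t in tokens if t in EMB]
    degree = {}
    for i, word in enumerate(known):
        for j in range(i + 1, min(i + 1 + window, len(known))):
            other = known[j]
            if other != word:
                degree[word] = degree.get(word, 0) + 1
                degree[other] = degree.get(other, 0) + 1
    total = sum(degree.values())
    return {w: d / total for w, d in degree.items()} if total else {}


def soft_similarity(query_vec, doc_vec):
    """Soft cosine: every word pair contributes by embedding similarity."""
    def inner(a, b):
        return sum(
            wa * wb * cosine(EMB[x], EMB[y])
            for x, wa in a.items()
            for y, wb in b.items()
        )
    denom = math.sqrt(inner(query_vec, query_vec) * inner(doc_vec, doc_vec))
    return inner(query_vec, doc_vec) / denom if denom else 0.0


query = centrality_vector(["car", "engine"])
doc_a = centrality_vector(["automobile", "motor"])  # no shared words, same topic
doc_b = centrality_vector(["banana", "fruit"])      # no shared words, other topic

sim_a = soft_similarity(query, doc_a)
sim_b = soft_similarity(query, doc_b)
```

Neither candidate shares a word with the query, so a purely lexical comparison of term vectors would score both at zero. Under the assumptions above, the embedding-aware score still ranks the document about automobiles above the one about fruit.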


Semantic Search;Document Feature Vector;Vector Space Model;Word2vec


Supported by: Korea Railroad Research Institute


