• Title/Summary/Keyword: query expansion

Search Result 129, Processing Time 0.027 seconds

A Method of Chinese and Thai Cross-Lingual Query Expansion Based on Comparable Corpus

  • Tang, Peili;Zhao, Jing;Yu, Zhengtao;Wang, Zhuo;Xian, Yantuan
    • Journal of Information Processing Systems
    • /
    • v.13 no.4
    • /
    • pp.805-817
    • /
    • 2017
  • Cross-lingual query expansion is usually based on the relationship among monolingual words. Bilingual comparable corpus contains relationships among bilingual words. Therefore, this paper proposes a method based on these relationships to conduct query expansion. First, the word vectors which characterize the bilingual words are trained using Chinese and Thai bilingual comparable corpus. Then, the correlation between Chinese query words and Thai words are computed based on these word vectors, followed with selecting the Thai candidate expansion terms via the correlative value. Then, multi-group Thai query expansion sentences are built by the Thai candidate expansion words based on Chinese query sentence. Finally, we can get the optimal sentence using the Chinese and Thai query expansion method, and perform the Thai query expansion. Experiment results show that the cross-lingual query expansion method we proposed can effectively improve the accuracy of Chinese and Thai cross-language information retrieval.

XML Information Retrieval by Document Filtering and Query Expansion Based on Ontology (온톨로지 기반 문서여과 및 질의확장에 의한 XML 정보검색)

  • Kim Myung Sook;Kong Yong-Hae
    • Journal of Korea Multimedia Society
    • /
    • v.8 no.5
    • /
    • pp.596-605
    • /
    • 2005
  • Conventional XML query methods such as simple keyword match or structural query expansion are not sufficient to catch the underlying information in the documents. Moreover, these methods inefficiently try to query all the documents. This paper proposes document tittering and query expansion methods that are based on ontology. Using ontology, we construct a universal DTD that can filter off unnecessary documents. Then, query expansion method is developed through the analysis of concept hierarchy and association among concepts. The proposed methods are applied on variety of sample XML documents to test the effectiveness.

  • PDF

Optimizing the Additional Term Weight Ratio in Query Expansion Search based on Dictionary Definition (사전 의미 기반의 질의확장 검색에서 추가 용어 가중치 최적화)

  • 최영란;전유정;박순철
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.8 no.2
    • /
    • pp.45-53
    • /
    • 2003
  • The significances of this paper are of two points. One is that this research develops the query expansion search by adding the related terms based on the dictionary to the original query terms. This method shortens the process of the conventional model of query expansion utilizing the feedback data of the search. The other is that this research tries to find out the optimal point of precisions and recalls by differentiating the weight ratio between original quay and additional terms. This method shows that the efficiency and precision of query expansion search increase.

  • PDF

Domain Centered Query Expansion Technique using Topic Model (토픽 모델을 사용한 도메인 중심 질의 확장 기술)

  • Lee, Sanghoon;Moon, Seung-Jin
    • KIISE Transactions on Computing Practices
    • /
    • v.23 no.11
    • /
    • pp.611-616
    • /
    • 2017
  • In the area of Information Retrieval, Query Expansion is a well-known technique that uses external knowledge to increase an inquiry generated by users. However, ambiguous words used in the query decrease the performance of search tools. In this paper, we propose a solution to the above problem, by using domain knowledge which identifies the meaning of words in the query. In particular, we present a domain centered query expansion technique that magnifies a query using domains. By comparing with various query expansion models, we demonstrate that the proposed model performs better than the other models.

A Brief Survey into the Field of Automatic Image Dataset Generation through Web Scraping and Query Expansion

  • Bart Dikmans;Dongwann Kang
    • Journal of Information Processing Systems
    • /
    • v.19 no.5
    • /
    • pp.602-613
    • /
    • 2023
  • High-quality image datasets are in high demand for various applications. With many online sources providing manually collected datasets, a persisting challenge is to fully automate the dataset collection process. In this study, we surveyed an automatic image dataset generation field through analyzing a collection of existing studies. Moreover, we examined fields that are closely related to automated dataset generation, such as query expansion, web scraping, and dataset quality. We assess how both noise and regional search engine differences can be addressed using an automated search query expansion focused on hypernyms, allowing for user-specific manual query expansion. Combining these aspects provides an outline of how a modern web scraping application can produce large-scale image datasets.

Efficient Query Expansion Method using Fuzzy Thesaurus in Component Retrieval (컴포넌트 검색에서 퍼지 시소러스를 이용한 효율적인 질의확장 방법)

  • 김귀정;한정수
    • The Journal of the Korea Contents Association
    • /
    • v.4 no.1
    • /
    • pp.76-82
    • /
    • 2004
  • In this paper, we used query evaluation method through thesaurus for retrieving Components having concept relation with any classes in a query. Queries are presented in boolean and expanded by similar table. Query expansion by thesaurus is the solution of the term mismatching and it enhanced precision and recall of the components retrieval. For efficiency evaluation of query expansion, we defined most critical value through a simulation and compared precision and recall each other.

  • PDF

Query Expansion Using Augmented Terms in an Extended Boolean Model

  • Nguyen, Tuan-Quang;Heo, Jun-Seok;Lee, Jung-Hoon;Kim, Yi-Reun;Whang, Kyu-Young
    • Journal of Computing Science and Engineering
    • /
    • v.2 no.1
    • /
    • pp.26-43
    • /
    • 2008
  • We propose a new query expansion method in the extended Boolean model that improves precision without degrading recall. For improving precision, our method promotes the ranks of documents having more query terms since users typically prefer such documents. The proposed method consists of the following three steps: (1) expanding the query by adding new terms related to each term of the query, (2) further expanding the query by adding augmented terms, which are conjunctions of the terms, (3) assigning a weight on each term so that augmented terms have higher weights than the other terms. We conduct extensive experiments to show the effectiveness of the proposed method. The experimental results show that the proposed method improves precision by up to 102% for the TREC-6 data compared with the existing query expansion method using a thesaurus proposed by Kwon et al.

A Study on Query Expansion Using Concept (개념을 이용한 질의 확장에 관한 연구)

  • Han Jung-Soo;Kim Gui-Jung
    • The Journal of the Korea Contents Association
    • /
    • v.5 no.1
    • /
    • pp.135-145
    • /
    • 2005
  • Without detailed exact knowledge of a retrieval collection, most users find it difficult to formulate effective queries. In fact, most users may spend large amount of time formulating queries in order to obtain their desired result. A method to overcome this difficulty is to use query expansion that reformulates better query from initial query. In this paper we propose concept based query evaluation method using concept of class that retrieved from initial query. This concept is expanded through thesaurus. For efficiency evaluation of query expansion, we defined most critical value through a simulation and compared precision and recall each other.

  • PDF

A Query Expansion Technique using Query Patterns in QA systems (QA 시스템에서 질의 패턴을 이용한 질의 확장 기법)

  • Kim, Hea-Jung;Bu, Ki-Dong
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.12 no.1
    • /
    • pp.1-8
    • /
    • 2007
  • When confronted with a query, question answering systems endeavor to extract the most exact answers possible by determining the answer type that fits with the key terms used in the query. However, the efficacy of such systems is limited by the fact that the terms used in a query may be in a syntactic form different to that of the same words in a document. In this paper, we present an efficient semantic query expansion methodology based on query patterns in a question category concept list comprised of terms that are semantically close to terms used in a query. The proposed system first constructs a concept list for each question type and then builds the concept list for each question category using a learning algorithm. The results of the present experiments suggest the promise of the proposed method.

  • PDF

Word Embeddings-Based Pseudo Relevance Feedback Using Deep Averaging Networks for Arabic Document Retrieval

  • Farhan, Yasir Hadi;Noah, Shahrul Azman Mohd;Mohd, Masnizah;Atwan, Jaffar
    • Journal of Information Science Theory and Practice
    • /
    • v.9 no.2
    • /
    • pp.1-17
    • /
    • 2021
  • Pseudo relevance feedback (PRF) is a powerful query expansion (QE) technique that prepares queries using the top k pseudorelevant documents and choosing expansion elements. Traditional PRF frameworks have robustly handled vocabulary mismatch corresponding to user queries and pertinent documents; nevertheless, expansion elements are chosen, disregarding similarity to the original query's elements. Word embedding (WE) schemes comprise techniques of significant interest concerning QE, that falls within the information retrieval domain. Deep averaging networks (DANs) defines a framework relying on average word presence passed through multiple linear layers. The complete query is understandably represented using the average vector comprising the query terms. The vector may be employed for determining expansion elements pertinent to the entire query. In this study, we suggest a DANs-based technique that augments PRF frameworks by integrating WE similarities to facilitate Arabic information retrieval. The technique is based on the fundamental that the top pseudo-relevant document set is assessed to determine candidate element distribution and select expansion terms appropriately, considering their similarity to the average vector representing the initial query elements. The Word2Vec model is selected for executing the experiments on a standard Arabic TREC 2001/2002 set. The majority of the evaluations indicate that the PRF implementation in the present study offers a significant performance improvement compared to that of the baseline PRF frameworks.