An Improved Combined Content-similarity Approach for Optimizing Web Query Disambiguation

Kamal, Shahid; Ibrahim, Roliana; Ghani, Imran

  • Received : 2015.09.04
  • Accepted : 2015.11.17
  • Published : 2015.12.31


Web search engines face uncertainty when users submit ambiguous queries, which makes it difficult to retrieve accurate results. Ambiguous queries constitute a significant fraction of search queries and pose real challenges to web search engines. Moreover, researchers have become interested in handling web search by considering context from a location perspective. Our proposed disambiguation approach is designed to improve the user experience by combining context, in the form of location relevance, with document relevance. The premise is that giving the user a comprehensive location perspective of a topic is more informative than retrieving a result that contains only temporal or context information. From a user perspective, the ability to use this location information can be useful for several tasks, including query understanding and location-based clustering. To carry out the approach, we developed a Java-based prototype that derives contextual information from the web results returned for queries drawn from well-known datasets. Within those results, queries are further classified so that the search can be performed more broadly. After the results are presented to users and they make their selections, feedback is recorded implicitly to improve web search based on contextual information. The experimental results show that our approach achieves precision of 75%, accuracy of 73%, recall of 81%, and F-measure of 78% when compared with the generic temporal evaluation approach, and precision of 86%, accuracy of 71%, recall of 67%, and F-measure of 75% when compared with the web document clustering approach.
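
The reported F-measure values can be cross-checked against the reported precision and recall. The sketch below assumes that "F-measure" means the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state which variant was used. The class name `FMeasureCheck` and method `f1` are hypothetical and are not taken from the authors' prototype.

```java
import java.util.Locale;

public class FMeasureCheck {

    /**
     * Balanced F-measure (F1): harmonic mean of precision and recall.
     * Assumption: the abstract's "f-measure" refers to this variant.
     */
    public static double f1(double precision, double recall) {
        if (precision + recall == 0.0) {
            return 0.0; // avoid division by zero when both inputs are zero
        }
        return 2.0 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Precision and recall as reported in the abstract.
        double vsTemporal = f1(0.75, 0.81);
        double vsClustering = f1(0.86, 0.67);

        System.out.println(String.format(Locale.ROOT,
                "vs. generic temporal evaluation: F1 = %.2f", vsTemporal));
        System.out.println(String.format(Locale.ROOT,
                "vs. web document clustering:     F1 = %.2f", vsClustering));
    }
}
```

Under this assumption the two results round to 0.78 and 0.75, which agrees with the F-measure figures of 78% and 75% given above.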


Content similarity; query disambiguation; web search; location; temporal information

