Automatic Suggestion for PubMed Query Reformulation

  • Tuan, Luu Anh (School of Computer Engineering, Nanyang Technological University)
  • Kim, Jung-Jae (School of Computer Engineering, Nanyang Technological University)
  • Received : 2012.02.24
  • Accepted : 2012.05.18
  • Published : 2012.06.30

Abstract

Query reformulation is an interactive process in which users revise their queries according to the query results. To assist biomedical researchers in this process, we present novel methods for automatically generating query reformulation suggestions. Whereas previous work on query reformulation focused on adding words to user queries, our methods handle three types of query reformulation: addition, removal, and replacement. For the addition type, the accuracy of our method is ten times better than that of PubMed's "Also try" feature, and the execution time is short enough for practical use.
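
To make the three reformulation types concrete, the following is a minimal sketch that labels the change between two consecutive queries by comparing their term sets. It is an illustration only, not the method presented in the paper: the function name `classify_reformulation`, the whitespace tokenization, and the "unchanged" label are all assumptions introduced here.

```python
def classify_reformulation(old_query: str, new_query: str) -> str:
    """Label the change from old_query to new_query.

    Hypothetical helper for illustration: terms are lowercased,
    whitespace-separated tokens, and word order is ignored.
    """
    old_terms = set(old_query.lower().split())
    new_terms = set(new_query.lower().split())

    added = new_terms - old_terms      # terms present only in the new query
    removed = old_terms - new_terms    # terms present only in the old query

    if added and removed:
        return "replacement"           # some terms swapped for others
    if added:
        return "addition"              # new query is a superset of the old one
    if removed:
        return "removal"               # new query is a subset of the old one
    return "unchanged"                 # same term set (assumed extra label)
```

Under these assumptions, going from "breast cancer" to "breast cancer treatment" is labeled an addition, the reverse a removal, and "breast cancer" to "lung cancer" a replacement. A real system would likely need phrase-aware tokenization, since biomedical queries often contain multi-word terms.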

Cited by

  1. "LAILAPS-QSM: A RESTful API and JAVA library for semantic query suggestions," PLoS Computational Biology, vol. 14, no. 3, 2018. https://doi.org/10.1371/journal.pcbi.1006058