Answer Snippet Retrieval for Question Answering of Medical Documents

Lee, Hyeon-gu; Kim, Minkyoung; Kim, Harksoo

doi:10.5626/JOK.2016.43.8.927

Journal of KIISE (정보과학회 논문지)

Volume 43 Issue 8 / Pages 927-932 / 2016 / 2383-630X (pISSN) / 2383-6296 (eISSN)

Korean Institute of Information Scientists and Engineers (한국정보과학회)

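Lee, Hyeon-gu (Department of Computer and Communications Engineering, Kangwon National University) ;
Kim, Minkyoung (Department of Computer and Communications Engineering, Kangwon National University) ;
Kim, Harksoo (Department of Computer and Communications Engineering, Kangwon National University)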

Received : 2016.02.04
Accepted : 2016.05.08
Published : 2016.08.15

Abstract

With the explosive growth in the number of online medical documents, demand for question-answering systems is increasing. Recently, question-answering models based on machine learning have achieved high performance in various domains. However, many question-answering models in the medical domain still rely on information retrieval techniques because training data are sparse. Building on various information retrieval techniques, we propose an answer snippet retrieval model for question-answering systems over medical documents. The proposed model first retrieves candidate answer sentences from medical documents using a cluster-based retrieval technique. It then generates reliable answer snippets by re-ranking the candidate answer sentences with a model based on various sentence retrieval techniques. In experiments on BioASQ 4b, the proposed model outperformed previous models, achieving a MAP of 0.0604.
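The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring functions, so the cluster-smoothed query-likelihood score, the weights `lam` and `beta`, the term-overlap re-ranker, the function names, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; a real system would normalize terms).
    return text.lower().split()


def lm_score(query, doc_tf, cluster_tf, coll_tf, lam=0.5, beta=0.3):
    """Query log-likelihood of a document whose language model is smoothed
    with its cluster and with the whole collection (weights are assumptions)."""
    d_len = sum(doc_tf.values())
    c_len = sum(cluster_tf.values())
    g_len = sum(coll_tf.values())
    score = 0.0
    for term in query:
        p_doc = doc_tf[term] / d_len if d_len else 0.0
        p_clu = cluster_tf[term] / c_len if c_len else 0.0
        # Add-one smoothing keeps the probability positive, so log() is defined.
        p_col = (coll_tf[term] + 1) / (g_len + len(coll_tf) + 1)
        score += math.log(lam * p_doc + (1 - lam) * (beta * p_clu + (1 - beta) * p_col))
    return score


def retrieve_snippets(question, clusters, top_docs=2, top_k=3):
    """clusters: list of clusters; each cluster is a list of documents;
    each document is a list of sentences (strings)."""
    q = tokenize(question)
    coll_tf = Counter(t for c in clusters for d in c for s in d for t in tokenize(s))

    # Stage 1: cluster-based retrieval of documents that supply candidate sentences.
    scored_docs = []
    for cluster in clusters:
        cluster_tf = Counter(t for d in cluster for s in d for t in tokenize(s))
        for doc in cluster:
            doc_tf = Counter(t for s in doc for t in tokenize(s))
            scored_docs.append((lm_score(q, doc_tf, cluster_tf, coll_tf), doc))
    scored_docs.sort(key=lambda pair: pair[0], reverse=True)

    # Stage 2: re-rank candidate sentences (query-term overlap first, document score second).
    q_terms = set(q)
    candidates = []
    for doc_score, doc in scored_docs[:top_docs]:
        for sent in doc:
            overlap = len(q_terms & set(tokenize(sent))) / (len(q_terms) or 1)
            candidates.append((overlap, doc_score, sent))
    candidates.sort(key=lambda c: (c[0], c[1]), reverse=True)
    return [sent for _, _, sent in candidates[:top_k]]


# Toy data (hypothetical): two clusters of pre-split documents.
toy_clusters = [
    [["aspirin inhibits platelet aggregation", "it is taken orally"],
     ["aspirin reduces fever"]],
    [["insulin regulates blood glucose", "the pancreas secretes insulin"]],
]
snippets = retrieve_snippets("aspirin platelet aggregation", toy_clusters)
print(snippets)
```

Smoothing each document with its cluster lets a document score above zero for query terms that appear only in related documents, which is the usual motivation for cluster-based retrieval. The re-ranker here uses a single feature; the abstract instead refers to a combination of several sentence retrieval techniques.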

Acknowledgement

Grant : Development of a linked-data-based conversational question-answering retrieval framework

Supported by : LG Electronics
