An Experimental Study on an Effective Word Sense Disambiguation Model Based on Automatic Sense Tagging Using Dictionary Information

(Korean title: An Experimental Study on a Word Sense Disambiguation Model Using Dictionary Information)

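  • 이용구 (Department of Library and Information Science, Yonsei University)
  • 정영미 (Department of Library and Information Science, Yonsei University)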
  • Published : 2007.03.30

Abstract

This study presents an effective word sense disambiguation model that requires no manual sense tagging: the correct sense is tagged automatically using a machine-readable dictionary, and the tagged data serve as training data for a sense classifier. Two automatic tagging methods were applied, a dictionary information-based method and a collocation co-occurrence-based method. The dictionary information-based method with multiple feature selection achieved a tagging accuracy of 70.06%, and the collocation co-occurrence-based method 56.33%. The sense classifier trained with dictionary information-based tagging achieved a classification accuracy of 68.11%, and the one trained with collocation co-occurrence-based tagging 62.09%. The combined tagging method, which applies a data fusion technique, achieved a higher tagging accuracy of 76.09%, resulting in a classification accuracy of 76.16%.
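The idea of tagging a sense from dictionary information can be illustrated with a minimal sketch. It assumes a simple gloss-overlap score in the spirit of the Lesk algorithm; the abstract does not describe the authors' actual features or feature selection, so this is not their method. The toy dictionary, the sense labels, and the function name `tag_sense` are all hypothetical.

```python
# Hypothetical toy dictionary: sense label -> gloss words (not taken from the paper).
TOY_DICTIONARY = {
    "bank/finance": "institution that accepts deposits and lends money".split(),
    "bank/river": "sloping land beside a river or stream".split(),
}


def tag_sense(context_words, dictionary):
    """Return the sense whose gloss shares the most words with the context.

    This is a simplified gloss-overlap heuristic, used here only to show how
    dictionary text can stand in for manually tagged training examples.
    """
    context = {w.lower() for w in context_words}
    # Score each sense by the number of distinct words shared with its gloss.
    scores = {sense: len(context & set(gloss)) for sense, gloss in dictionary.items()}
    best = max(scores, key=scores.get)
    return best, scores


sense, scores = tag_sense("she deposits money at the bank".split(), TOY_DICTIONARY)
print(sense, scores)
```

Contexts tagged this way could then be used as training examples for a sense classifier, which is the overall pipeline the abstract describes.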

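(Translated from the Korean abstract.) This study presents a word sense disambiguation model that, without manual tagging, automatically tags senses using a machine-readable dictionary and then classifies senses with a classifier built from the tagged data as training data. Two methods were applied for automatic tagging: a method based on information extracted from the dictionary and a method based on collocation co-occurrence. In the automatic tagging experiments, the dictionary information-based method with multiple feature reduction reached a tagging accuracy of 70.06%, a relative improvement of 24.37% over the 56.33% of the collocation co-occurrence-based method. The classifier using the dictionary information-based method reached a classification accuracy of 68.11%, a relative improvement of 9.7% over the 62.09% of the collocation co-occurrence-based method. Combining the two automatic tagging methods yielded a tagging accuracy of 76.09% and a classification accuracy of 76.16%.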
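The improvement figures of 24.37% and 9.7% are relative gains, not percentage-point differences. A short check, assuming the usual definition (new - baseline) / baseline, reproduces them from the reported accuracies. The function name `relative_improvement` is illustrative only.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, expressed in percent."""
    return (new - baseline) / baseline * 100.0


# Accuracies reported in the abstract (percent).
tagging_gain = relative_improvement(70.06, 56.33)
classification_gain = relative_improvement(68.11, 62.09)

print(f"tagging gain: {tagging_gain:.2f}%")
print(f"classification gain: {classification_gain:.2f}%")
```

In absolute terms the differences are 13.73 and 6.02 percentage points, respectively.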
