Comparison Thai Word Sense Disambiguation Method

  • Modhiran, Teerapong (Computer Engineering Department, Faculty of Engineering, King Mongkut's Institute of Technology)
  • Kruatrachue, Boontee (Computer Engineering Department, Faculty of Engineering, King Mongkut's Institute of Technology)
  • Supnithi, Thepchai (Information Research and Development Division, National Electronic and Computer Technology Center)
  • Published : 2004.08.25

Abstract

Word sense disambiguation (WSD) is one of the most important problems in natural language processing, affecting research areas such as information retrieval and machine translation. Many approaches can resolve word ambiguity with a reasonable degree of accuracy. They fall into three strategies: knowledge-based, corpus-based, and hybrid. This paper focuses on the corpus-based strategy. Its purpose is to compare three well-known machine learning techniques, SNoW, SVM, and Naive Bayes, on word sense disambiguation for the Thai language. Ten ambiguous words are selected and tested with word and part-of-speech (POS) features. The results show that the SVM algorithm gives the best results for Thai WSD, with an accuracy of approximately 83-96%.
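
To make the corpus-based setup concrete, the following is a minimal sketch of one of the three compared techniques, Naive Bayes, applied to word and POS context features. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the feature templates, so the context window of two tokens, the feature naming, the add-one smoothing, the `NaiveBayesWSD` class, and the English toy sentences for the word "bank" are all hypothetical. SNoW and SVM are not shown.

```python
import math
from collections import Counter, defaultdict


def extract_features(tokens, tags, i, window=2):
    """Collect word and POS features around the ambiguous word at index i.

    The +/-2 window and the feature naming are assumptions for illustration.
    """
    feats = []
    for off in range(-window, window + 1):
        j = i + off
        if off == 0 or j < 0 or j >= len(tokens):
            continue  # skip the target word itself and out-of-range positions
        feats.append(f"w[{off}]={tokens[j]}")
        feats.append(f"p[{off}]={tags[j]}")
    return feats


class NaiveBayesWSD:
    """Multinomial Naive Bayes over symbolic features with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.sense_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        for feats, sense in examples:
            self.sense_counts[sense] += 1
            for f in feats:
                self.feat_counts[sense][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, feats):
        total = sum(self.sense_counts.values())
        best, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            score = math.log(n / total)  # log prior of the sense
            denom = sum(self.feat_counts[sense].values()) + self.alpha * len(self.vocab)
            for f in feats:
                if f in self.vocab:  # features never seen in training are ignored
                    score += math.log((self.feat_counts[sense][f] + self.alpha) / denom)
            if score > best_score:
                best, best_score = sense, score
        return best


# Hypothetical toy training data: (tokens, POS tags, target index, sense label).
train = [
    ("the muddy bank of the river".split(),
     ["DET", "ADJ", "NOUN", "ADP", "DET", "NOUN"], 2, "river"),
    ("money in the bank account today".split(),
     ["NOUN", "ADP", "DET", "NOUN", "NOUN", "ADV"], 3, "finance"),
]
model = NaiveBayesWSD().fit(
    [(extract_features(toks, tags, i), sense) for toks, tags, i, sense in train]
)

test_feats = extract_features(
    "a steep bank of the stream".split(),
    ["DET", "ADJ", "NOUN", "ADP", "DET", "NOUN"], 2,
)
print(model.predict(test_feats))
```

In a comparison like the one the abstract describes, the same feature vectors would be fed to each learner in turn, so that differences in accuracy reflect the algorithm rather than the features.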

Keywords