Cloning of Korean Morphological Analyzers using Pre-analyzed Eojeol Dictionary and Syllable-based Probabilistic Model
 Title & Authors
Shim, Kwangseob
 
 Abstract
This study verifies the feasibility of a Korean morphological analyzer built from a pre-analyzed eojeol dictionary and a syllable-based probabilistic model. To do so, two existing Korean morphological analyzers, MACH and KLT2000, were cloned using a pre-analyzed eojeol dictionary and a syllable-based probabilistic model, and the output of each clone was compared with that of the original analyzer. The 10-million-eojeol Sejong corpus was divided into 10 sets for cross-validation. The 10-fold cross-validated precision and recall were 97.16% and 98.31% for the cloned MACH, and 96.80% and 99.03% for the cloned KLT2000. The cloned MACH analyzed 308,000 eojeols per second, and the cloned KLT2000 analyzed 436,000 eojeols per second. These results indicate that a Korean morphological analyzer based on a pre-analyzed eojeol dictionary and a syllable-based probabilistic model can be used in practical applications.
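The evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not state the unit over which precision and recall were counted or how the corpus was partitioned, so the sketch assumes that an analysis is a list of (morpheme, tag) pairs per eojeol, that scores are micro-averaged over those pairs, and that folds are formed by taking every tenth item. The function names `precision_recall` and `ten_fold`, and the sample analyses, are hypothetical and are not taken from the paper.

```python
from collections import Counter


def precision_recall(predicted, reference):
    """Micro-averaged precision and recall over (morpheme, tag) pairs.

    predicted: per-eojeol output of the cloned analyzer (assumed format).
    reference: per-eojeol output of the original analyzer (assumed format).
    """
    matched = n_pred = n_ref = 0
    for pred, ref in zip(predicted, reference):
        # Multiset intersection counts each shared pair at most as often
        # as it occurs in both analyses.
        matched += sum((Counter(pred) & Counter(ref)).values())
        n_pred += len(pred)
        n_ref += len(ref)
    return matched / n_pred, matched / n_ref


def ten_fold(items, k=10):
    """Yield (train, test) splits; fold i tests on items i, i+k, i+2k, ..."""
    for i in range(k):
        test = items[i::k]
        train = [x for j, x in enumerate(items) if j % k != i]
        yield train, test


# Hypothetical analyses of two eojeols; the tags are illustrative only.
reference = [
    [("나", "NP"), ("는", "JX")],
    [("가", "VV"), ("ㄴ다", "EF")],
]
predicted = [
    [("나", "NP"), ("는", "JX")],
    [("가", "VV"), ("ㄴ", "ETM"), ("다", "EF")],
]

print(precision_recall(predicted, reference))  # (0.6, 0.75)
```

In this toy case three of the five predicted pairs match the reference, which contains four pairs, giving a precision of 0.6 and a recall of 0.75. In a cross-validation run, each `train` split would be used to build the dictionary and the probabilistic model, and the scores on the ten `test` splits would be averaged.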
 Keywords
Korean morphology; morphological analysis; probabilistic model; pre-analyzed dictionary; Sejong corpus
 Language
Korean