Comparative Study of Various Persian Stemmers in the Field of Information Retrieval
Moghadam, Fatemeh Momenipour; Keyvanpour, MohammadReza
In linguistics, stemming is the operation of reducing words to a more general base form, called the 'stem'. Stemming is an important step in information retrieval systems, natural language processing, and text mining. Information retrieval systems are evaluated with metrics such as precision and recall, which are also used to judge whether one system outperforms another. Stemmers reduce the size of the index, increase the speed of information retrieval systems, and improve their performance by raising precision and recall. There are few Persian stemmers, and most of them are based on morphological rules. In this paper we study Persian stemmers, which we classify into three main classes: structural stemmers, lookup table stemmers, and statistical stemmers. We describe the algorithms of each class and present the strengths and weaknesses of each Persian stemmer. We also propose metrics for comparing and evaluating the stemmers.
Lookup Table Stemmer; Stemmer; Statistical Stemmer; Structural Stemmer
