Title :
STHREE: Stemmer for Malayalam using three pass algorithm
Author :
Pragisha, K. ; Reghuraj, P.C.
Author_Institution :
Dept. of Comput. Sci. & Eng., Gov. Eng. Coll. Sreekrishnapuram, Palakkad, India
Abstract :
This paper reports the design of STHREE, a three-pass stemmer for Malayalam. The language is rich in morphological variations but poor in linguistic computational resources. The system returns the meaningful root word of the input word in 97% of cases when tested with 1040 words. This is a significant improvement over the reported accuracy of the SILPA system, the only known stemmer for Malayalam, on the same test data sets.
Keywords :
computational linguistics; linguistics; natural language processing; Malayalam; SILPA system; STHREE; data sets; input word; linguistic computational resources; morphological variations; root word; three pass stemmer; Accuracy; Algorithm design and analysis; Computer science; Educational institutions; Knowledge discovery; Stemmer
Conference_Title :
Control Communication and Computing (ICCC), 2013 International Conference on
Conference_Location :
Thiruvananthapuram
Print_ISBN :
978-1-4799-0573-7
DOI :
10.1109/ICCC.2013.6731640