• DocumentCode
    679809
  • Title

    STHREE: Stemmer for Malayalam using three pass algorithm

  • Author

    Pragisha, K. ; Reghuraj, P.C.

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Gov. Eng. Coll. Sreekrishnapuram, Palakkad, India
  • fYear
    2013
  • fDate
    13-15 Dec. 2013
  • Firstpage
    149
  • Lastpage
    152
  • Abstract
    This paper reports the design of STHREE, a three-pass stemmer for Malayalam. The language is rich in morphological variation but poor in linguistic computational resources. When tested on 1040 words, the system returns the meaningful root word of the input word in 97% of cases. This is a significant improvement over the reported accuracy of the SILPA system, the only known stemmer for Malayalam, on the same test data sets.
  • Keywords
    computational linguistics; linguistics; natural language processing; Malayalam; SILPA system; STHREE; data sets; input word; linguistic computational resources; morphological variations; root word; three pass stemmer; Accuracy; Algorithm design and analysis; Computer science; Educational institutions; Knowledge discovery; Stemmer
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Control Communication and Computing (ICCC), 2013 International Conference on
  • Conference_Location
    Thiruvananthapuram
  • Print_ISBN
    978-1-4799-0573-7
  • Type

    conf

  • DOI
    10.1109/ICCC.2013.6731640
  • Filename
    6731640