• DocumentCode
    126750
  • Title
    Stemmer for resource scarce language using string similarity measure
  • Author
    Debbarma, Abhijit ; Purkayastha, Bs ; Bhattacharya, Pallab
  • Author_Institution
    Dept. of Inf. Technol., Ramkrishna Mahavidyalaya, Unakoti, India
  • fYear
    2014
  • fDate
    6-8 Feb. 2014
  • Firstpage
    96
  • Lastpage
    98
  • Abstract
    This paper, a work in progress, describes stemming of the Kokborok language using a statistical approach. Stemming of Kokborok is a new research topic. Many stemming algorithms have been proposed for various languages, but most of the work has been done for English. Recently, interest in non-English languages has grown as well. However, very little or no computational work has been reported for Kokborok, a dialect spoken in Tripura, India. Kokborok is a highly inflectional language. Linguistic knowledge and resources are basic requirements for building a rule-based stemmer, and Kokborok, being new to this area of computational study, lacks them. This work attempts to build a Kokborok stemmer using a statistical approach based on a string similarity measure.
  • Keywords
    knowledge based systems; natural language processing; statistical analysis; English language; India; Kokborok language; Tripura; inflectional language; linguistic knowledge; resource scarce language; rule based stemmer; statistical approach; stemming algorithm; string similarity measure; String Similarity; Supervised learning; kokborok; nlp; stemmer
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Optimization, Reliability, and Information Technology (ICROIT), 2014 International Conference on
  • Conference_Location
    Faridabad
  • Print_ISBN
    978-1-4799-3958-9
  • Type
    conf
  • DOI
    10.1109/ICROIT.2014.6798299
  • Filename
    6798299