DocumentCode
126750
Title
Stemmer for resource scarce language using string similarity measure
Author
Debbarma, Abhijit ; Purkayastha, Bs ; Bhattacharya, Pallab
Author_Institution
Dept. of Inf. Technol., Ramkrishna Mahavidyalaya, Unakoti, India
fYear
2014
fDate
6-8 Feb. 2014
Firstpage
96
Lastpage
98
Abstract
This paper, a work in progress, describes stemming for the Kokborok language using a statistical approach. Stemming for Kokborok is a new research topic. Many stemming algorithms have been proposed for various languages, but most of the work has been done for English. In recent times, interest in non-English languages has grown as well. However, very little or no computational work has been reported for Kokborok, a dialect spoken in Tripura, India. Kokborok is a highly inflectional language. Linguistic knowledge and resources are basic requirements for building a rule-based stemmer, and Kokborok, being new to computational study, lacks them. This work attempts to build a Kokborok stemmer using a statistical approach based on a string similarity measure.
Keywords
knowledge based systems; natural language processing; statistical analysis; English language; India; Kokborok language; Tripura; inflectional language; linguistic knowledge; resource scarce language; rule based stemmer; statistical approach; stemming algorithm; string similarity measure; String Similarity; Supervised learning; kokborok; nlp; stemmer;
fLanguage
English
Publisher
IEEE
Conference_Titel
Optimization, Reliability, and Information Technology (ICROIT), 2014 International Conference on
Conference_Location
Faridabad
Print_ISBN
978-1-4799-3958-9
Type
conf
DOI
10.1109/ICROIT.2014.6798299
Filename
6798299