A practical part-of-speech tagger for Bengali

Author

Sarkar, Kamal ; Gayen, Vivekananda

Author_Institution

Dept. of Comput. Sci. & Eng., Jadavpur Univ., Kolkata, India

fYear

2012

fDate

Nov. 30 2012-Dec. 1 2012

Firstpage

36

Lastpage

40

Abstract

This paper presents a practical part-of-speech (POS) tagger for Bengali that accepts raw Bengali text (typed in a Bengali font) and produces POS-tagged output that can be used directly by other NLP applications. We implemented a supervised Bengali trigram POS tagger from scratch using a statistical machine learning technique based on the second-order Hidden Markov Model (HMM). A bigram POS tagger serves as the baseline against which the trigram tagger is compared.
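
For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of a second-order (trigram) HMM tagger with Viterbi decoding. It is not the authors' implementation: the add-one smoothing, the boundary symbols `<s>` and `</s>`, the function names `train` and `viterbi`, and the romanized toy corpus with its three-tag tagset are all assumptions made for illustration only.

```python
from collections import Counter
import math

START, STOP = "<s>", "</s>"  # hypothetical sentence-boundary symbols


def train(tagged_sentences):
    """Count tag trigrams, their bigram contexts, and word emissions."""
    tri, bi, emit, tag_count = Counter(), Counter(), Counter(), Counter()
    vocab = set()
    for sent in tagged_sentences:
        tags = [START, START] + [t for _, t in sent] + [STOP]
        for i in range(2, len(tags)):
            tri[tuple(tags[i - 2:i + 1])] += 1  # (t_{i-2}, t_{i-1}, t_i)
            bi[tuple(tags[i - 2:i])] += 1       # context (t_{i-2}, t_{i-1})
        for word, tag in sent:
            emit[(tag, word)] += 1
            tag_count[tag] += 1
            vocab.add(word)
    return tri, bi, emit, tag_count, vocab


def viterbi(words, model):
    """Return the most probable tag sequence under the trigram HMM."""
    tri, bi, emit, tag_count, vocab = model
    tags = sorted(tag_count)
    n_out = len(tags) + 1     # next symbol can be any tag or STOP
    n_words = len(vocab) + 1  # known words plus one slot for unknowns

    def log_q(u, v, w):
        # Add-one smoothed transition P(w | u, v); an assumption of this sketch.
        return math.log((tri[(u, v, w)] + 1) / (bi[(u, v)] + n_out))

    def log_e(tag, word):
        # Add-one smoothed emission P(word | tag); an assumption of this sketch.
        return math.log((emit[(tag, word)] + 1) / (tag_count[tag] + n_words))

    # best[(u, v)] = best log-probability of a prefix whose last two tags are u, v
    best = {(START, START): 0.0}
    back = []  # back[k][(v, w)] = tag two positions before w at word k
    for word in words:
        new_best, pointers = {}, {}
        for (u, v), score in best.items():
            for w in tags:
                s = score + log_q(u, v, w) + log_e(w, word)
                if (v, w) not in new_best or s > new_best[(v, w)]:
                    new_best[(v, w)] = s
                    pointers[(v, w)] = u
        best = new_best
        back.append(pointers)

    # Choose the final state, including the transition to STOP.
    state = max(best, key=lambda s: best[s] + log_q(s[0], s[1], STOP))
    result = []
    for pointers in reversed(back):
        result.append(state[1])
        state = (pointers[state], state[0])
    return result[::-1]


# Toy, made-up training data (romanized tokens, hypothetical tagset).
corpus = [
    [("ami", "PRON"), ("bhat", "NOUN"), ("khai", "VERB")],
    [("tumi", "PRON"), ("boi", "NOUN"), ("poro", "VERB")],
    [("ami", "PRON"), ("boi", "NOUN"), ("pori", "VERB")],
]
model = train(corpus)
print(viterbi(["tumi", "bhat", "khai"], model))
```

The decoder keeps one score per pair of preceding tags, which is what distinguishes the second-order model from a bigram baseline that conditions on a single preceding tag.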

Keywords

hidden Markov models; learning (artificial intelligence); natural language processing; statistical analysis; text analysis; Bengali font; Bengali text; HMM; Hidden Markov Model; NLP applications; practical part-of-speech tagger; statistical machine learning technique; supervised Bengali trigram POS Tagger; Equations; Hidden Markov models; Natural language processing; Speech; Tagging; Training; Viterbi algorithm; Bengali Language; Part-of-speech tagging; Second order hidden markov model;

fLanguage

English

Publisher

IEEE

Conference_Titel

Emerging Applications of Information Technology (EAIT), 2012 Third International Conference on

Conference_Location

Kolkata

Print_ISBN

978-1-4673-1828-0

Type

conf

DOI

10.1109/EAIT.2012.6407856

Filename

6407856