Title :
Design of a POS tagger using conditional random fields for Malayalam
Author :
Krishnapriya, V. ; Sreesha, P. ; Harithalakshmi, T.R. ; Archana, T.C. ; Vettath, Jayasree N.
Author_Institution :
Dept. of Comput. Sci. & Eng., Sreepathy Inst. of Manage. & Technol., Palakkad, India
Abstract :
Part-of-speech (POS) tagging is the process of marking each word in a text as corresponding to a particular part of speech, based on its definition and context. A POS tagger plays an important role in natural language applications such as speech recognition, natural language parsing, information retrieval, and information extraction. This paper discusses an architecture for designing a POS tagger for the Malayalam language using Conditional Random Fields (CRF). The experiments presented in this paper use an annotated corpus of 1028 sentences (11,315 words) and a tagset of 100 tags. A trigram-based tagging scheme is used in the experiments. The proposed system is based on an empirical approach that models the human POS tagging process more realistically than existing systems, without compromising efficiency or accuracy.
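The abstract does not spell out the feature design, so the following is only a minimal sketch of one possible reading of a trigram-based scheme: each word is described by itself plus its left and right neighbours, giving a three-word window that a CRF toolkit could consume as per-token features. The function names, the `<BOS>`/`<EOS>` boundary markers, and the suffix feature are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Dict, List


def trigram_features(words: List[str], i: int) -> Dict[str, str]:
    """Build features for the word at position i from a three-word window.

    Hypothetical helper: the window layout and feature names are assumptions.
    """
    word = words[i]
    # Boundary markers stand in for the missing neighbour at sentence edges.
    prev_word = words[i - 1] if i > 0 else "<BOS>"
    next_word = words[i + 1] if i < len(words) - 1 else "<EOS>"
    return {
        "word": word,
        "prev_word": prev_word,
        "next_word": next_word,
        # Assumed extra feature: a short suffix, often useful for
        # morphologically rich languages.
        "suffix3": word[-3:],
    }


def sentence_features(words: List[str]) -> List[Dict[str, str]]:
    """Return one feature dictionary per word in the sentence."""
    return [trigram_features(words, i) for i in range(len(words))]
```

Calling `sentence_features(["w1", "w2", "w3"])` yields three dictionaries, one per token. A list of such per-sentence feature sequences, paired with the gold tag sequences from an annotated corpus, is the usual input shape for training a linear-chain CRF.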
Keywords :
information retrieval; natural language processing; speech processing; CRF; Malayalam language; POS tagger design; annotated corpus; conditional random field; conditional random fields; information extraction; information retrieval; natural language applications; natural language parsing; speech recognition; speech tagging; text words; Accuracy; Hidden Markov models; Speech; Speech processing; Support vector machines; Tagging; Training; CRF; Hidden Markov Model; Malayalam; POS tagging; Stochastic process;
Conference_Title :
Computational Systems and Communications (ICCSC), 2014 First International Conference on
Conference_Location :
Trivandrum
Print_ISBN :
978-1-4799-6012-5
DOI :
10.1109/COMPSC.2014.7032680