Title :
Part of speech Tagging for Myanmar using Hidden Markov Model
Author :
Zin, Khine Khine ; Thein, N.L.
Author_Institution :
Univ. of Comput. Studies, Yangon, Myanmar
Abstract :
Part-of-speech (POS) tagging is the process of assigning each word the category that best suits both the definition of the word and the context of the sentence in which it is used. In this paper, we describe a machine learning algorithm for Myanmar tagging using a corpus-based approach. Tagging the Myanmar language requires word segmentation, part-of-speech tagging using a hidden Markov model (HMM), and several tag sets. The paper therefore combines supervised and unsupervised learning, which use a pre-tagged and an untagged corpus, respectively. To assign the correct tag to each word, we describe supervised POS tagging, which uses class labels in terms of predictor features on a manually tagged corpus, and unsupervised POS tagging, which trains automatically without a manually tagged corpus. Experiments investigate the best configuration on different amounts of training data; the reported accuracy is 97.56%.
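As an illustration of the supervised HMM tagging the abstract mentions, the following is a minimal sketch, not the authors' implementation. It assumes a bigram HMM with add-one smoothing and Viterbi decoding; the toy English corpus, the tag names, and the helper names (`train`, `log_prob`, `viterbi`) are hypothetical and stand in for a segmented, manually tagged Myanmar corpus. The unsupervised training step is not covered.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus of (word, tag) pairs; NOT data from the paper.
TAGGED = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

def train(sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for sent in sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), vocab

def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log probability of `key` given a count table."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))

def viterbi(words, trans, emit, tags, vocab):
    """Return the most probable tag sequence for `words`."""
    if not words:
        return []
    n_words = len(vocab) + 1  # one extra slot for unknown words
    # Each column maps tag -> (best log score, backpointer to previous tag).
    best = [{
        t: (log_prob(trans["<s>"], t, len(tags))
            + log_prob(emit[t], words[0], n_words), None)
        for t in tags
    }]
    for word in words[1:]:
        col = {}
        for t in tags:
            score, prev = max(
                (best[-1][p][0] + log_prob(trans[p], t, len(tags)), p)
                for p in tags
            )
            col[t] = (score + log_prob(emit[t], word, n_words), prev)
        best.append(col)
    # Follow backpointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for col in reversed(best[1:]):
        tag = col[tag][1]
        path.append(tag)
    return path[::-1]

trans, emit, tags, vocab = train(TAGGED)
print(viterbi(["the", "dog", "sleeps"], trans, emit, tags, vocab))
```

Log probabilities are used to avoid numeric underflow on longer sentences, and the smoothing choice here is only one simple option among many.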
Keywords :
hidden Markov models; natural language processing; speech processing; unsupervised learning; HMM; Myanmar Tagging; Myanmar language; corpus based approach; hidden Markov model; machine learning algorithm; part of speech tagging; pretagged corpus; unsupervised POS tagging; unsupervised learning; untagged corpus; word segmentation; Hidden Markov models; Machine learning algorithms; Natural language processing; Natural languages; Speech analysis; Speech processing; Tagging; Testing; Training data; White spaces;
Conference_Title :
Current Trends in Information Technology (CTIT), 2009 International Conference on the
Conference_Location :
Dubai
Print_ISBN :
978-1-4244-5754-0
Electronic_ISBN :
978-1-4244-5756-4
DOI :
10.1109/CTIT.2009.5423133