Title :
Combining POS taggers in master-slaves technique for highly inflected languages as Arabic
Author_Institution :
Dept. of Comput. Sci., Univ. of Technol., Baghdad, Iraq
Abstract :
Part-of-speech (POS) tagging is a basic process for almost all natural language processing (NLP) applications. The typical methods for combining different taggers (the programs that perform POS tagging) are voting and stacking. We propose a Master-Slaves Technique, which combines a Hidden Markov Model (HMM) tagger as the master with any number of other taggers of any type as slaves. We describe the construction, whose main idea is that, before the master tagger tags an input sentence, its probabilities are modified to favor the tags that the slave taggers assigned to the same sentence. We then report on tests of this technique using maximum match (MM) and Brill taggers as slaves. Additionally, we report a similar, custom method of integrating HMM and MM with a higher degree of integration between the component taggers. Despite its simplicity, this method increases accuracy in several tests on Arabic and English. For English we used a well-known data set (the Brown corpus), and for Arabic a private corpus of 45k words.
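The main idea of the abstract can be illustrated with a minimal sketch. The abstract does not say which HMM probabilities are modified or by how much, so everything specific here is an assumption made for illustration: the function names `viterbi` and `master_slaves_tag`, the multiplicative `boost` factor, the choice to boost emission scores, and the toy three-tag model. It is not the authors' implementation.

```python
def viterbi(n, tags, start_p, trans_p, emit_scores):
    """Plain Viterbi decoding over n positions.

    emit_scores holds one dict {tag: score} per word position.
    """
    best = [{t: start_p[t] * emit_scores[0][t] for t in tags}]
    back = []
    for i in range(1, n):
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] * trans_p[p][t])
            scores[t] = best[-1][prev] * trans_p[prev][t] * emit_scores[i][t]
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def master_slaves_tag(words, tags, start_p, trans_p, emit_p,
                      slave_outputs, boost=2.0, floor=1e-6):
    """Tag `words` with the HMM master after favoring the slaves' tags.

    Assumption: each slave output is a tag sequence aligned with `words`
    and drawn from the same tagset; `boost` is a hypothetical parameter.
    """
    emit_scores = []
    for i, word in enumerate(words):
        scores = {t: emit_p.get(word, {}).get(t, floor) for t in tags}
        for slave_tags in slave_outputs:
            scores[slave_tags[i]] *= boost  # privilege the slave's choice
        emit_scores.append(scores)
    return viterbi(len(words), tags, start_p, trans_p, emit_scores)


# Toy model (invented numbers, for illustration only).
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {
    "DET": {"DET": 0.1, "NOUN": 0.8, "VERB": 0.1},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.4, "NOUN": 0.4, "VERB": 0.2},
}
emit_p = {
    "the": {"DET": 0.9},
    "can": {"NOUN": 0.05, "VERB": 0.95},
    "rusts": {"NOUN": 0.4, "VERB": 0.6},
}
words = ["the", "can", "rusts"]

# Stand-ins for the outputs of two slave taggers (e.g. MM and Brill).
slave_a = ["DET", "NOUN", "VERB"]
slave_b = ["DET", "NOUN", "VERB"]

master_only = master_slaves_tag(words, tags, start_p, trans_p, emit_p, [])
combined = master_slaves_tag(words, tags, start_p, trans_p, emit_p,
                             [slave_a, slave_b])
# master_only -> ['DET', 'VERB', 'NOUN']
# combined    -> ['DET', 'NOUN', 'VERB']
```

In this toy case the master alone mis-tags "can" as a verb, and the boosted scores let Viterbi recover the sequence both slaves proposed. The scores are left unnormalized, which does not affect the argmax path; a real system would likely tune or learn the boost rather than fix it at 2.0.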
Keywords :
hidden Markov models; natural language processing; Arabic languages; Brill taggers; English languages; HMM tagger; MM; NLP; POS taggers; component taggers; hidden Markov model; highly inflected languages; master-slaves technique; maximum match technique; natural language processing; part of speech tagging; stacking techniques; voting techniques; Accuracy; Hidden Markov models; Master-slave; Natural language processing; Speech; Stacking; Tagging; Arabic POS tagging; Master-Slaves POS tagging; combining POS taggers;
Conference_Title :
Cognitive Computing and Information Processing (CCIP), 2015 International Conference on
Conference_Location :
Noida
DOI :
10.1109/CCIP.2015.7100682