DocumentCode
707644
Title
Combining POS taggers in master-slaves technique for highly inflected languages as Arabic
Author
Aliwy, Ahmed H.
Author_Institution
Dept. of Comput. Sci., Univ. of Technol., Baghdad, Iraq
fYear
2015
fDate
3-4 March 2015
Firstpage
1
Lastpage
5
Abstract
Part-of-speech (POS) tagging is a basic process for almost all natural language processing (NLP) applications. The typical methods for combining different taggers (the programs that perform POS tagging) are voting and stacking. We propose a Master-Slaves Technique, which combines a Hidden Markov Model (HMM) tagger as master with any number of other taggers of any type as slaves. We describe the construction, whose main idea is that, before the master tagger tags an input sentence, the master's probabilities are modified to favor the tags that the slave taggers assigned to the same sentence. We then report on tests of this technique using maximum match (MM) and Brill taggers as slaves. Additionally, we report a similar, custom method of integrating HMM and MM, with a higher degree of integration of the component taggers. Despite its simplicity, this method offers an increase in accuracy in several tests on Arabic and English. We used a well-known data set (the Brown corpus) for English and a private Arabic corpus consisting of 45k words.
Keywords
hidden Markov models; natural language processing; Arabic languages; Brill taggers; English languages; HMM tagger; MM; NLP; POS taggers; component taggers; hidden Markov model; highly inflected languages; master-slaves technique; maximum match technique; natural language processing; part of speech tagging; stacking techniques; voting techniques; Accuracy; Hidden Markov models; Master-slave; Natural language processing; Speech; Stacking; Tagging; Arabic POS tagging; Master-Slaves POS tagging; combining POS taggers;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cognitive Computing and Information Processing (CCIP), 2015 International Conference on
Conference_Location
Noida
Type
conf
DOI
10.1109/CCIP.2015.7100682
Filename
7100682