Title :
PTokenizer: POS tagger Tokenizer
Author :
Saeed Rahmani;Seyyed Mostafa Fakhrahmad;Mohammad Hadi Sadredini
Author_Institution :
Department of Computer and IT Engineering, Shiraz University, Shiraz, Iran
Abstract :
With the advent of new information sources and the expansion of text data, natural language processing (NLP) has become one of the key parts of all systems dealing with human-written text, and part-of-speech (POS) tagging is an inseparable part of all NLP tasks. As a result, it is of paramount importance to enhance the accuracy of POS tagging. In this paper, applying a language model and statistical information, we introduce a new approach to tokenizing sentences and preparing them to be labeled by POS taggers. An evaluation shows that the proposed method yields a precision of 98 percent for tokenization, and that applying it to Maximum Likelihood and TnT POS taggers improves the accuracy of Persian POS tagging.
Keywords :
"Decision support systems","Power capacitors","Speech","Speech processing","Tagging","Probabilistic logic","Compounds"
Conference_Title :
Knowledge-Based Engineering and Innovation (KBEI), 2015 2nd International Conference on
DOI :
10.1109/KBEI.2015.7436056