Conference record number:
144
Article title:
Improving Persian POS Tagging Using the Maximum Entropy Model
Authors:
Kardan Ahmad A (author), Bahojb Imani Maryam (author)
Keywords:
Natural language processing, Part of speech tagging, Persian Part of Speech Tagging, Maximum Entropy
Conference title:
Proceedings of the 12th Iranian Conference on Intelligent Systems
Persian abstract:
Part of speech (POS) tagging is one of the
fundamental steps in many speech and text processing
applications. It is the process of assigning each word in an
input sentence its category according to its
contextual and grammatical properties. In addition to the general
difficulties of POS tagging, such as disambiguating multi-category
words and handling unknown words, Persian,
unlike English, is a free word order language with
its own characteristics. These challenges can greatly affect the
quality of POS tagging. Efficient POS
taggers have been developed for some languages,
especially English, but only a few studies
have addressed Persian. To address these issues
and achieve high POS tagging accuracy, we chose features that
capture the important characteristics of words in a sentence and
used maximum entropy as the machine learning classifier.
Experimental results show that the proposed Persian POS
tagging system outperforms other state-of-the-art Persian
taggers.
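
The abstract does not list the features or the training setup, so the following is only a minimal sketch of the general idea of a maximum entropy tagger over contextual word features. Everything specific in it is an assumption made for illustration: the feature set in `extract_features`, the tag labels (`PRO`, `N`, `POSTP`, `V`), the transliterated toy sentences, the learning rate, and the plain stochastic gradient training loop. It is not the authors' implementation.

```python
import math
from collections import defaultdict


def extract_features(words, i, prev_tag):
    """Hypothetical feature set; the paper's actual features are not given here."""
    w = words[i]
    return [
        "bias",
        "word=" + w,
        "prefix2=" + w[:2],
        "suffix2=" + w[-2:],
        "prev_word=" + (words[i - 1] if i > 0 else "<s>"),
        "next_word=" + (words[i + 1] if i + 1 < len(words) else "</s>"),
        "prev_tag=" + prev_tag,
    ]


def predict_probs(weights, tags, feats):
    """Maximum entropy (softmax) distribution over tags for one token."""
    scores = {t: sum(weights.get((f, t), 0.0) for f in feats) for t in tags}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}


def train(tagged_sentences, epochs=100, lr=0.1):
    """Fit weights by stochastic gradient ascent on the conditional log-likelihood."""
    tags = sorted({t for sent in tagged_sentences for _, t in sent})
    weights = defaultdict(float)
    for _ in range(epochs):
        for sent in tagged_sentences:
            words = [w for w, _ in sent]
            prev = "<s>"
            for i, (_, gold) in enumerate(sent):
                feats = extract_features(words, i, prev)
                probs = predict_probs(weights, tags, feats)
                for t in tags:
                    # Gradient: observed feature count minus expected count.
                    grad = (1.0 if t == gold else 0.0) - probs[t]
                    for f in feats:
                        weights[(f, t)] += lr * grad
                prev = gold  # condition on the gold previous tag during training
    return weights, tags


def tag(weights, tags, words):
    """Greedy left-to-right decoding using the previously predicted tag."""
    out, prev = [], "<s>"
    for i in range(len(words)):
        probs = predict_probs(weights, tags, extract_features(words, i, prev))
        prev = max(tags, key=lambda t: probs[t])
        out.append(prev)
    return out


# Toy, transliterated training data with assumed tag labels (illustration only).
TRAIN = [
    [("man", "PRO"), ("ketab", "N"), ("ra", "POSTP"), ("khandam", "V")],
    [("u", "PRO"), ("nameh", "N"), ("ra", "POSTP"), ("nevesht", "V")],
]

weights, tags = train(TRAIN)
predicted = tag(weights, tags, ["man", "ketab", "ra", "khandam"])
```

Greedy decoding is used here only to keep the sketch short; a real tagger would more likely use beam search or Viterbi decoding, regularization, and a tagged corpus instead of two toy sentences.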
Conference document number:
3817034