Conference record number :
2139
Paper title :
Preparing an accurate Persian POS tagger suitable for MT
Title in another language :
Preparing an accurate Persian POS tagger suitable for MT
Authors :
Shakeri Zakieh (author) , Riahi Noushin (author) , Khadivi Shahram (author)
Keywords :
POS tag , Persian POS , MT , SMT
Conference title :
First International Conference on Persian Script and Language Processing
Abstract :
In this paper, an accurate Persian part-of-speech (POS) tagger suitable for machine translation (MT) is prepared. First, a new set of POS tags is defined that is general and more usable for MT than detailed tag sets. Then an accurate tagged corpus is prepared by modifying the Bijankhan corpus. The Stanford POS tagger is trained on the modified Bijankhan corpus; the resulting tagger reaches 99.36% accuracy, a significant improvement over previous Persian taggers. The effect of using this tagger in statistical machine translation (SMT) is then investigated. The outputs show better performance than plain SMT, whereas using the previous tagger in SMT lowers the BLEU score relative to plain SMT.
Conference document number :
4474716