DocumentCode :
3714225
Title :
Introducing XGL - a lexicalised probabilistic graphical lemmatiser for isiXhosa
Author :
Lulamile Mzamo;Albert Helberg;Sonja Bosch
Author_Institution :
Faculty of Engineering, North-West University, Potchefstroom, South Africa
fYear :
2015
Firstpage :
142
Lastpage :
147
Abstract :
This paper presents XGL, a lexicalised probabilistic graphical lemmatiser for isiXhosa. An overview of isiXhosa lemmatisation issues is given, followed by a discussion of previous work on automated lemmatisation for isiXhosa. The paper then motivates a machine-learning lemmatiser for isiXhosa. The isiXhosa data used to train the lemmatiser is analysed, and the best features are identified from that analysis. The inner workings of XGL are detailed and evaluation results are presented. XGL is shown to achieve an accuracy of 83.19% on a gold standard of word-lemma pairs, outperforming similar lemmatisers such as LemmaGen (80.6%) and the CST lemmatiser (73.13%) when trained with 35000 word-lemma pairs.
Keywords :
Transforms
Publisher :
IEEE
Conference_Title :
Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference (PRASA-RobMech), 2015
Type :
conf
DOI :
10.1109/RoboMech.2015.7359513
Filename :
7359513