DocumentCode :
2259651
Title :
Classification of Persian textual documents using learning vector quantization
Author :
Pilevar, Mohammad Taher ; Feili, Heshaam ; Soltani, Mahmood
Author_Institution :
Univ. of Tehran, Tehran, Iran
fYear :
2009
fDate :
24-27 Sept. 2009
Firstpage :
1
Lastpage :
6
Abstract :
Classifying text documents into a predefined set of classes is an important task in natural language processing applications. Text classification systems usually trade accuracy against complexity. This paper presents an experiment in classifying Persian documents with a Learning Vector Quantization (LVQ) network. In this method, each class is represented by an exemplar vector called a codebook. The codebook vectors are placed in the feature space so that the decision boundaries are approximated by the nearest neighbor rule. Compared to the K-Nearest Neighbour method, LVQ requires fewer training examples, and it is believed to be much faster than other classification methods. The experimental results obtained from classifying Persian textual documents with the LVQ algorithm are promising and indicate that it can serve as an alternative to other methods such as Support Vector Machines.
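The codebook idea described in the abstract can be illustrated with a minimal sketch. It assumes the basic LVQ1 update rule, one codebook vector per class, squared Euclidean distance, and a linearly decaying learning rate; the abstract does not state which LVQ variant, distance, or schedule the authors used. The function names (`nearest`, `train_lvq1`, `predict`) and the toy term-count vectors are hypothetical and are not taken from the paper or from the Hamshahri2 corpus.

```python
import random


def nearest(codebooks, x):
    """Return the index of the codebook vector closest to x (squared Euclidean)."""
    best, best_d = 0, float("inf")
    for i, (w, _) in enumerate(codebooks):
        d = sum((a - b) ** 2 for a, b in zip(w, x))
        if d < best_d:
            best, best_d = i, d
    return best


def train_lvq1(samples, labels, epochs=30, lr=0.3, seed=0):
    """Train one codebook vector per class with the LVQ1 rule (assumed variant)."""
    rng = random.Random(seed)
    # Initialise each class's codebook from the first sample seen for that class.
    codebooks, seen = [], set()
    for x, y in zip(samples, labels):
        if y not in seen:
            seen.add(y)
            codebooks.append(([float(v) for v in x], y))
    order = list(range(len(samples)))
    for epoch in range(epochs):
        alpha = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            w, c = codebooks[nearest(codebooks, x)]
            # Pull the winner toward x if its class matches, push it away otherwise.
            sign = 1.0 if c == y else -1.0
            for j in range(len(w)):
                w[j] += sign * alpha * (x[j] - w[j])
    return codebooks


def predict(codebooks, x):
    """Classify x by the nearest neighbor rule over the codebook vectors."""
    return codebooks[nearest(codebooks, x)][1]


# Hypothetical toy data: three-term count vectors for two made-up classes.
samples = [[3, 0, 1], [2, 1, 0], [0, 3, 2], [1, 2, 3]]
labels = ["sport", "sport", "economy", "economy"]
codebooks = train_lvq1(samples, labels)
```

At prediction time only the codebook vectors are compared against a new document, not the whole training set, which is the source of the speed advantage over K-Nearest Neighbour that the abstract mentions.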
Keywords :
natural language processing; pattern classification; text analysis; vector quantisation; K-nearest neighbour method; Persian documents; Persian textual documents; codebook vectors; learning vector quantization; nearest neighbor rule; text classification systems; text document classification; Artificial neural networks; Neural networks; Prototypes; Support vector machine classification; Support vector machines; Text categorization; Vector quantization; Voting; Web pages; Hamshahri2 Persian textual corpus; text classification
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-4538-7
Electronic_ISBN :
978-1-4244-4540-0
Type :
conf
DOI :
10.1109/NLPKE.2009.5313761
Filename :
5313761