DocumentCode
590946
Title
Improvement in automatic classification of Persian documents by means of Naïve Bayes and Representative Vector
Author
Jafari, Aghil ; Hosseinejad, M. ; Amiri, Ali
Author_Institution
Islamic Azad Univ. of Zanjan, Zanjan, Iran
fYear
2011
fDate
13-14 Oct. 2011
Firstpage
226
Lastpage
229
Abstract
A Representative Vector is a vector that holds related words together with the degree of their relationship. This paper examines the effect of using such vectors on the automatic classification of Persian documents. In the proposed method, the documents are first preprocessed, and extra words and word stems are identified. Next, features are extracted for each category using one of the established methods. A Representative Vector is then built from the extracted features, yielding more specific words that better represent each category. The experiments show that precision and recall can be increased significantly by omitting extra words, adding a few words to the Representative Vectors, and using a well-known classification model such as Naïve Bayes.
Keywords
Bayes methods; classification; document handling; Naive Bayes; Persian documents; automatic classification model; feature extraction; representative vector; Computers; Educational institutions; Information retrieval; Semantics; Support vector machine classification; Text categorization; Vectors; Documents Classification; Naïve Bayes Classifier; Representative Vector; Stemming;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer and Knowledge Engineering (ICCKE), 2011 1st International eConference on
Conference_Location
Mashhad
Print_ISBN
978-1-4673-5712-8
Type
conf
DOI
10.1109/ICCKE.2011.6413355
Filename
6413355