Title :
Automatic text classification using modified centroid classifier
Author :
Elmarhumy, Mahmoud ; Fattah, M.A. ; Ren, Fuji
Author_Institution :
Fac. of Eng., Univ. of Tokushima, Tokushima, Japan
Abstract :
This work proposes an approach that addresses the inductive bias, or model misfit, introduced by the assumption underlying the centroid classifier, with the aim of improving automatic text classification. The approach is a trainable classifier that uses tf-idf as its text feature. Its main idea is to use the training errors most similar to the classification model to update that model successively, subject to a threshold. The approach is simple to implement and flexible. Its performance is measured at several threshold values on the Reuters-21578 text categorization test collection. The experimental results show that the proposed approach can improve the performance of the centroid classifier.
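The abstract does not give the update rule itself, so the following is only a minimal sketch of the general idea: build tf-idf centroids, find misclassified training documents, and let those whose similarity to their true-class centroid reaches a threshold adjust the model. Everything specific here is an assumption made for illustration rather than the authors' method: the function names, the smoothed idf formula, the choice to add the document to its true centroid and subtract it from the wrongly predicted one, and the `rate` and `epochs` parameters.

```python
import math
from collections import Counter, defaultdict


def normalise(vec):
    """Scale a sparse vector (dict: term -> weight) to unit L2 length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def tfidf_vectors(docs):
    """Turn tokenised documents into unit-length tf-idf vectors.

    Assumption: a smoothed idf, log((1 + n) / (1 + df)) + 1.
    """
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return [
        normalise({t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0)
                   for t, tf in Counter(doc).items()})
        for doc in docs
    ]


def build_centroids(vectors, labels):
    """Plain centroid classifier: one normalised sum vector per class."""
    sums = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for t, w in vec.items():
            sums[label][t] += w
    return {label: normalise(s) for label, s in sums.items()}


def classify(vec, centroids):
    """Return the class whose centroid is most similar to vec."""
    return max(centroids, key=lambda c: cosine(vec, centroids[c]))


def refine(vectors, labels, centroids, threshold, rate=0.5, epochs=5):
    """Hypothetical error-driven update; the input centroids are not mutated."""
    model = {c: dict(v) for c, v in centroids.items()}
    for _ in range(epochs):
        # Collect training errors that are still close to their true class.
        errors = []
        for vec, true in zip(vectors, labels):
            pred = classify(vec, model)
            if pred != true and cosine(vec, model[true]) >= threshold:
                errors.append((vec, true, pred))
        if not errors:
            break
        # Assumed rule: pull the true centroid toward the document and
        # push the wrongly predicted centroid away from it.
        for vec, true, pred in errors:
            for t, w in vec.items():
                model[true][t] = model[true].get(t, 0.0) + rate * w
                model[pred][t] = model[pred].get(t, 0.0) - rate * w
        model = {c: normalise(v) for c, v in model.items()}
    return model


docs = [["wheat", "grain", "crop"], ["grain", "wheat", "export"],
        ["oil", "crude", "barrel"], ["crude", "oil", "price"]]
labels = ["grain", "grain", "oil", "oil"]
vectors = tfidf_vectors(docs)
model = refine(vectors, labels, build_centroids(vectors, labels), threshold=0.3)
print(classify(vectors[0], model))
```

In this sketch the threshold acts as a filter: a higher value lets fewer training errors influence the centroids, which is one plausible reading of why the abstract reports results at several threshold values.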
Keywords :
data mining; pattern classification; text analysis; automatic text classification; modified centroid classifier; text categorization; text feature; Bayesian methods; Classification tree analysis; Error correction; Internet; Machine learning; Organizing; Supervised learning; Testing; Text classification; centroid classifier
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-4538-7
Electronic_ISBN :
978-1-4244-4540-0
DOI :
10.1109/NLPKE.2009.5313757