DocumentCode :
2959213
Title :
Classifiers based on Bernoulli mixture models for text mining and handwriting recognition tasks
Author :
Saeed, Mehreen ; Babri, Haroon
Author_Institution :
Nat. Univ. of Comput. & Emerging Sci., Lahore
fYear :
2008
fDate :
1-8 June 2008
Firstpage :
2169
Lastpage :
2175
Abstract :
In this paper we describe a model for classifying binary data using classifiers based on Bernoulli mixture models. We show how Bernoulli mixtures can be used for feature extraction and dimensionality reduction of raw input data. The extracted features are then used for training a classifier for supervised labeling of individual sample points. We have applied this method to two different types of datasets, i.e., one from the text mining domain and one from the handwriting recognition area. Empirical experiments demonstrate that we can obtain up to 99.9% reduction in the dimensionality of the original feature set for sparse binary features. Classification accuracy also increases considerably when the combined model is used. This paper compares the performance of different classification algorithms when used in conjunction with the new feature set generated by Bernoulli mixtures. Using this hybrid model of learning we have achieved one of the best accuracy rates on the NOVA and GINA datasets of the 'agnostic vs. prior knowledge' competition held by the International Joint Conference on Neural Networks in 2007.
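The abstract does not say how features are derived from the mixture, so the following is only a minimal sketch under stated assumptions: a Bernoulli mixture is fitted with EM, and the posterior component probabilities of each sample are taken as its low-dimensional feature vector. The function names `fit_bernoulli_mixture` and `responsibilities`, the initialisation range, and the smoothing constant `eps` are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def responsibilities(X, weights, probs):
    """Posterior P(component k | x) for each row of X, computed in log space."""
    # log p(x, k) = sum_d [x_d log p_kd + (1 - x_d) log(1 - p_kd)] + log w_k
    log_joint = X @ np.log(probs).T + (1.0 - X) @ np.log(1.0 - probs).T
    log_joint = log_joint + np.log(weights)
    log_joint -= log_joint.max(axis=1, keepdims=True)  # avoid underflow in exp
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)


def fit_bernoulli_mixture(X, n_components, n_iter=50, seed=0, eps=1e-6):
    """Fit a Bernoulli mixture to binary X of shape (n_samples, n_features) with EM."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    weights = np.full(n_components, 1.0 / n_components)  # mixing proportions
    # Assumed initialisation: random P(x_d = 1 | k) away from 0 and 1.
    probs = rng.uniform(0.25, 0.75, size=(n_components, n_features))
    for _ in range(n_iter):
        resp = responsibilities(X, weights, probs)       # E-step
        counts = resp.sum(axis=0)                        # soft component sizes
        # M-step, lightly smoothed so no weight collapses to exactly zero.
        weights = (counts + eps) / (n_samples + n_components * eps)
        probs = (resp.T @ X) / (counts[:, None] + eps)
        probs = np.clip(probs, eps, 1.0 - eps)           # keep the logs finite
    return weights, probs


# Synthetic binary data drawn from two hypothetical prototypes.
rng = np.random.default_rng(1)
prototypes = np.array([[0.9] * 10 + [0.1] * 10,
                       [0.1] * 10 + [0.9] * 10])
labels = rng.integers(0, 2, size=200)
X = (rng.random((200, 20)) < prototypes[labels]).astype(float)

weights, probs = fit_bernoulli_mixture(X, n_components=2)
features = responsibilities(X, weights, probs)  # 20 binary inputs -> 2 features
```

In this sketch the `features` array, with one column per mixture component, would be passed to any supervised classifier in place of the raw binary inputs.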
Keywords :
data mining; feature extraction; handwriting recognition; learning (artificial intelligence); neural nets; text analysis; Bernoulli mixture models; binary data classification; dimensionality reduction; feature extraction; handwriting recognition tasks; neural networks; supervised labeling; text mining; Classification algorithms; Data mining; Feature extraction; Handwriting recognition; Labeling; Neural networks; Support vector machine classification; Support vector machines; Text categorization; Text mining
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
Conference_Location :
Hong Kong
ISSN :
1098-7576
Print_ISBN :
978-1-4244-1820-6
Electronic_ISBN :
1098-7576
Type :
conf
DOI :
10.1109/IJCNN.2008.4634097
Filename :
4634097