DocumentCode :
1384093
Title :
Count Data Modeling and Classification Using Finite Mixtures of Distributions
Author :
Bouguila, Nizar
Author_Institution :
Concordia Inst. for Inf. Syst. Eng., Concordia Univ., Montreal, QC, Canada
Volume :
22
Issue :
2
Year :
2011
Firstpage :
186
Lastpage :
198
Abstract :
In this paper, we consider the problem of constructing accurate and flexible statistical representations for count data, which we often confront in many areas such as data mining, computer vision, and information retrieval. In particular, we analyze and compare several generative approaches widely used for count data clustering, namely multinomial, multinomial Dirichlet, and multinomial generalized Dirichlet mixture models. Moreover, we propose a clustering approach via a mixture model based on a composition of the Liouville family of distributions, from which we select the Beta-Liouville distribution, and the multinomial. The novel proposed model, which we call multinomial Beta-Liouville mixture, is optimized by deterministic annealing expectation-maximization and minimum description length, and strives to achieve a high accuracy of count data clustering and model selection. An important feature of the multinomial Beta-Liouville mixture is that it has fewer parameters than the recently proposed multinomial generalized Dirichlet mixture. The performance evaluation is conducted through a set of extensive empirical experiments, which concern text and image texture modeling and classification and shape modeling, and highlights the merits of the proposed models and approaches.
Keywords :
Liouville equation; expectation-maximisation algorithm; feature extraction; image texture; pattern clustering; solid modelling; text analysis; Beta Liouville distribution; count data modeling; data classification; data clustering; expectation maximization approach; finite distribution mixture; image classification; image texture modeling; multinomial generalized Dirichlet mixture; shape modeling; Annealing; Computational modeling; Data mining; Data models; Equations; Shape; Count data; Dirichlet; Fisher kernel; Liouville; deterministic annealing expectation-maximization; finite mixture models; generalized Dirichlet; model selection; multinomial; shape modeling; support vector machine; text categorization; texture classification; Algorithms; Artificial Intelligence; Automatic Data Processing; Computer Simulation; Data Mining; Humans; Mathematical Concepts; Models, Theoretical; Neural Networks (Computer); Pattern Recognition, Automated;
Language :
English
Journal_Title :
Neural Networks, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9227
Type :
jour
DOI :
10.1109/TNN.2010.2091428
Filename :
5640674