Title :
Unsupervised Learning of Categorical Data With Competing Models
Author_Institution :
Air Force Res. Lab., Wright-Patterson AFB, OH, USA
Abstract :
This paper considers the unsupervised learning of high-dimensional binary feature vectors representing categorical information. A cognitively inspired framework, referred to as modeling fields theory (MFT), is utilized as the basic methodology. A new MFT-based algorithm, referred to as accelerated maximum a posteriori (MAP), is proposed. Accelerated MAP allows simultaneous learning and selection of the number of models. The key feature of accelerated MAP is a steady increase of the regularization penalty, resulting in competition among models. The differences between this approach and other mixture-learning and model-selection methodologies are described. The operation of this algorithm and its parameter selection are discussed. Numerical experiments aimed at finding performance limits are conducted. The performance with real-world data is tested by applying the algorithm to a text categorization problem and to clustering Congressional voting data.
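The abstract describes the core mechanism only in outline, so the following is a minimal Python sketch of that idea, not the authors' accelerated-MAP or MFT dynamic-logic algorithm: a Bernoulli mixture is fit to binary vectors by EM-style MAP updates while a penalty on the mixing weights grows each iteration, so under-supported models lose the competition and are pruned, selecting the number of models automatically. The function name, the linear penalty schedule, and all parameter values are illustrative assumptions.

import numpy as np

def accelerated_map_sketch(X, k_init=10, n_iter=100, penalty_growth=0.05, seed=0):
    """Fit a Bernoulli mixture to binary data X (n x d) while a growing penalty
    on the mixing weights prunes weak components (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    k = k_init
    pi = np.full(k, 1.0 / k)                       # mixing weights
    theta = rng.uniform(0.25, 0.75, size=(k, d))   # Bernoulli parameters per model

    for t in range(n_iter):
        # E-step: posterior responsibilities under Bernoulli likelihoods (log domain).
        log_lik = X @ np.log(theta).T + (1.0 - X) @ np.log(1.0 - theta).T
        log_r = np.log(pi) + log_lik
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step with a steadily increasing penalty (assumed linear schedule).
        # The penalty acts like a negative pseudo-count on each model's mass,
        # so weakly supported models are driven to zero weight and removed.
        nk = r.sum(axis=0)
        penalty = penalty_growth * t * n / k
        pi = np.maximum(nk - penalty, 0.0)
        if not np.any(pi > 0):                     # safeguard: keep the strongest model
            pi = np.where(nk == nk.max(), nk, 0.0)
        keep = pi > 0
        pi, theta, r = pi[keep], theta[keep], r[:, keep]
        r /= r.sum(axis=1, keepdims=True) + 1e-12
        k = int(keep.sum())
        pi = pi / pi.sum()
        nk = r.sum(axis=0)
        theta = (r.T @ X + 1.0) / (nk[:, None] + 2.0)  # smoothed Bernoulli update

    return pi, theta

In this sketch, calling accelerated_map_sketch(X, k_init=20) on a 0/1 data matrix starts with 20 candidate models and returns only those that survive the competition; penalty_growth controls how aggressively the model count is reduced.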
Keywords :
data analysis; government data processing; maximum likelihood estimation; pattern clustering; text analysis; unsupervised learning; MFT-based algorithm; accelerated MAP; accelerated maximum a posteriori; categorical data; categorical information representation; cognitively inspired framework; Congressional voting data clustering; high-dimensional binary feature vector; model competition; model selection; modeling fields theory; parameter selection; performance limit; regularization penalty; text categorization problem; Acceleration; Computational modeling; Data models; Linear programming; Mathematical model; Numerical models; Vectors; Bernoulli mixture; dynamic logic; maximum a posteriori (MAP); regularization; text categorization; vague-to-crisp process
Journal_Title :
IEEE Transactions on Neural Networks and Learning Systems
DOI :
10.1109/TNNLS.2012.2213266