DocumentCode :
231929
Title :
Multi-LDA hybrid topic model with boosting strategy and its application in text classification
Author :
Wang Yongliang ; Guo Qiao
Author_Institution :
Sch. of Autom., Beijing Inst. of Technol., Beijing, China
fYear :
2014
fDate :
28-30 July 2014
Firstpage :
4802
Lastpage :
4806
Abstract :
Topic modeling, especially Latent Dirichlet Allocation (LDA), is an effective approach to feature selection and dimension reduction in text categorization tasks. Unlike the traditional Vector Space Model, LDA can easily overcome the curse of dimensionality and the feature sparsity problem. Mapping from the word space to the topic space brings further benefits, but at the same time the determination of model parameters becomes a new problem. This article proposes a novel classification algorithm that combines models with different parameters via a boosting strategy. Moreover, Naïve Bayes and Support Vector Machine are employed as weak classifiers, and a weighted method is proposed to improve accuracy by integrating the weak classifiers into a strong classifier in a more rational way. Experimental results show that our method performs well in both accuracy and generalization.
Keywords :
Bayes methods; feature extraction; generalisation (artificial intelligence); learning (artificial intelligence); pattern classification; support vector machines; text analysis; boosting strategy; classification algorithm; dimension reduction; dimensionality problem; feature selection; feature sparse problem; generalization; latent Dirichlet allocation; model parameters; multiLDA hybrid topic model; naïve Bayes; strong classifier; support vector machine; text categorization task; text classification; topic space; vector space model; weak classifier; weighted method; word space; Accuracy; Algorithm design and analysis; Boosting; Classification algorithms; Probabilistic logic; Resource management; Training; Boosting; Latent Dirichlet Allocation; Topic Model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Control Conference (CCC), 2014 33rd Chinese
Conference_Location :
Nanjing
Type :
conf
DOI :
10.1109/ChiCC.2014.6895752
Filename :
6895752