DocumentCode
1932047
Title
Multi-labeled document classification using semi-supervised mixture model of Watson distributions on document manifold
Author
Nguyen Kim Anh ; Ngo Van Linh ; Nguyen Khac Toi ; Nguyen The Tam
Author_Institution
Sch. of Inf. & Commun. Technol., Hanoi Univ. of Sci. & Technol., Hanoi, Vietnam
fYear
2013
fDate
15-18 Dec. 2013
Firstpage
123
Lastpage
128
Abstract
Classification of multilabel documents is essential to information retrieval and text mining. Most existing approaches to multilabel text classification ignore the relationship between class labels and input documents, and they rely entirely on labeled data. In practice, unlabeled data is readily available, whereas labeled data is expensive to generate and error-prone because it requires human annotation. In this paper, we propose a novel multilabel document classification approach based on a semi-supervised mixture model of Watson distributions on the document manifold, which explicitly considers the manifold structure of the document space in order to exploit both labeled and unlabeled data efficiently. The proposed approach models all labels within a dataset simultaneously, which lends itself well to capturing the relationships between these labels. Experimental results show that the proposed method outperforms state-of-the-art methods for multilabeled text classification.
Keywords
mixture models; pattern classification; statistical distributions; text analysis; Watson distributions; document manifold; labeled data; multilabeled document classification; multilabeled text classification; semisupervised mixture model; unlabeled data; Approximation methods; Art; Data models; Education; Manifolds; Support vector machines; Vectors; Laplacian Regularization; Mixture Models; Probabilistic Graphical Models; Semi-supervised Learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Soft Computing and Pattern Recognition (SoCPaR), 2013 International Conference of
Conference_Location
Hanoi
Print_ISBN
978-1-4799-3399-0
Type
conf
DOI
10.1109/SOCPAR.2013.7054113
Filename
7054113