DocumentCode :
3525706
Title :
Modeling Both Coarse-Grained and Fine-Grained Topics in Massive Text Data
Author :
Weifan Zhang ; Hui Zhang ; Yuan Zuo ; Deqing Wang
Author_Institution :
Sch. of Comput. Sci., Beihang Univ., Beijing, China
fYear :
2015
fDate :
March 30 2015-April 2 2015
Firstpage :
378
Lastpage :
383
Abstract :
Topic modeling has attracted much attention from researchers because it gives users insight into huge volumes of documents. However, most previous studies based on Non-negative Matrix Factorization (NMF) neglect to determine which topics are widespread in the documents and which are not. These widespread topics, which we refer to as coarse-grained topics, are of great significance to people who concentrate on the common topics in a given text set. For example, after reading a massive number of job ads, jobseekers are eager to learn employers' basic requirements, which can be regarded as coarse-grained topics, as well as the additional requirements, which can be regarded as fine-grained topics. In this paper, we propose a novel method that applies two different sparseness constraints to NMF to tell coarse-grained and fine-grained topics apart. The experimental results demonstrate that the new model can not only discover coarse-grained topics but also extract fine-grained topics. We evaluate the performance of the new model via text clustering and classification, and the results show that the new model can learn more accurate topic representations of documents.
Keywords :
matrix decomposition; pattern classification; pattern clustering; text analysis; NMF; coarse-grained topic modeling; coarse-grained topics; document topic representations; fine-grained topic modeling; massive text data; nonnegative matrix factorization; performance evaluation; text classification; text clustering; Computers; Electronic publishing; Encyclopedias; Internet; Matrix decomposition; Optimization; non-negative matrix factorization; text clustering; text mining; topic model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data Computing Service and Applications (BigDataService), 2015 IEEE First International Conference on
Conference_Location :
Redwood City, CA
Type :
conf
DOI :
10.1109/BigDataService.2015.21
Filename :
7184905