DocumentCode :
1679532
Title :
A new geometric approach to latent topic modeling and discovery
Author :
Ding, Weicong ; Rohban, Mohammad Hossein ; Ishwar, Prakash ; Saligrama, Venkatesh
Author_Institution :
Dept. of Electr. & Comput. Eng., Boston Univ., Boston, MA, USA
fYear :
2013
Firstpage :
5568
Lastpage :
5572
Abstract :
A new geometrically-motivated algorithm for topic modeling is developed and applied to the discovery of latent “topics” in text and image “document” corpora. The algorithm is based on robustly finding and clustering extreme-points of empirical cross-document word-frequencies that correspond to novel words unique to each topic. In contrast to related approaches that are based on solving non-convex optimization problems using suboptimal approximations, locally-optimal methods, or heuristics, the new algorithm is convex, has polynomial complexity, and has competitive qualitative and quantitative performance compared to the current state-of-the-art approaches on synthetic and real-world datasets.
Keywords :
approximation theory; data mining; document image processing; optimisation; pattern clustering; text analysis; empirical cross-document word-frequency; geometrically-motivated algorithm; image document corpora; latent topic discovery; latent topic modeling; locally-optimal method; nonconvex optimization problem; polynomial complexity; suboptimal approximation; Abstracts; Games; Integrated circuits; Logic gates; Nominations and elections; Support vector machines; Topic modeling; extreme points; nonnegative matrix factorization (NMF); subspace clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6638729
Filename :
6638729