DocumentCode
2504205
Title
On provable exact low-rank recovery in topic models
Author
Behmardi, Behrouz ; Raich, Raviv
Author_Institution
Sch. of EECS, Oregon State Univ., Corvallis, OR, USA
fYear
2011
fDate
28-30 June 2011
Firstpage
265
Lastpage
268
Abstract
In the past few years, probabilistic topic models have been developed and applied to problems in text document classification and computer vision. Such models provide a probabilistic framework for characterizing a corpus of documents (or images) in the bag-of-words representation. A key feature of such models is that a low-dimensional representation is facilitated through latent topic variables. Most inference algorithms in topic models assume a fixed number of topics and determine the number of topics empirically. In this paper, we consider the problem of identifying the number of topics in topic models. We present a rank minimization framework and provide sufficient conditions that guarantee exact recovery of the number of topics. Moreover, we propose a heuristic convex relaxation of the rank minimization. Using simulations, we show that the proposed convex relaxation provides exact rank recovery under the sufficient conditions proposed for the rank minimization problem.
Keywords
computer vision; inference mechanisms; pattern classification; probability; text analysis; bag-of-words representation; document corpus characterization; heuristic convex relaxation; inference algorithms; low rank matrix recovery; probabilistic topic models; provable exact low-rank recovery; rank minimization framework; text document classification; Computational modeling; Linear matrix inequalities; Matching pursuit algorithms; Minimization; Noise; Optimization; Probabilistic logic; nuclear norm minimization; topic models
fLanguage
English
Publisher
IEEE
Conference_Titel
Statistical Signal Processing Workshop (SSP), 2011 IEEE
Conference_Location
Nice
ISSN
pending
Print_ISBN
978-1-4577-0569-4
Type
conf
DOI
10.1109/SSP.2011.5967677
Filename
5967677