DocumentCode
3125767
Title
LPTA: A Probabilistic Model for Latent Periodic Topic Analysis
Author
Yin, Zhijun ; Cao, Liangliang ; Han, Jiawei ; Zhai, ChengXiang ; Huang, Thomas
Author_Institution
Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
fYear
2011
fDate
11-14 Dec. 2011
Firstpage
904
Lastpage
913
Abstract
This paper studies the problem of latent periodic topic analysis from time-stamped documents. Examples of time-stamped documents include news articles, sales records, financial reports, TV programs, and, more recently, posts from social media websites such as Flickr, Twitter, and Facebook. In contrast to detecting periodic patterns in traditional time series databases, we discover topics with coherent semantics and periodic characteristics, where a topic is represented by a distribution of words. We propose a model called LPTA (Latent Periodic Topic Analysis) that exploits the periodicity of the terms as well as term co-occurrences. To show the effectiveness of our model, we collect several representative datasets including Seminar, DBLP, and Flickr. The results show that our model can discover the latent periodic topics effectively and leverage the information from both text and time well.
Keywords
database management systems; document handling; time series; Facebook; Flickr; LPTA; TV programs; Twitter; financial reports; latent periodic topic analysis; probabilistic model; sales records; social media websites; time series database; time stamped documents; Analytical models; Complexity theory; Computational modeling; Databases; Equations; Mathematical model; Seminars; periodic topics; topic modeling
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location
Vancouver, BC
ISSN
1550-4786
Print_ISBN
978-1-4577-2075-8
Type
conf
DOI
10.1109/ICDM.2011.96
Filename
6137295