DocumentCode :
610359
Title :
A unified model for stable and temporal topic detection from social media data
Author :
Hongzhi Yin ; Bin Cui ; Hua Lu ; Yuxin Huang ; Junjie Yao
Author_Institution :
Dept. of Comput. Sci. & Technol., Peking Univ., Beijing, China
fYear :
2013
fDate :
8-12 April 2013
Firstpage :
661
Lastpage :
672
Abstract :
Web 2.0 users generate and spread huge amounts of messages in online social media. Such user-generated content is a mixture of temporal topics (e.g., breaking events) and stable topics (e.g., user interests). Due to their different natures, it is important and useful to distinguish temporal topics from stable topics in social media. However, such discrimination is very challenging because the user-generated texts in social media are very short and thus lack useful linguistic features for precise analysis using traditional approaches. In this paper, we propose a novel solution to detect both stable and temporal topics simultaneously from social media data. Specifically, a unified user-temporal mixture model is proposed to distinguish temporal topics from stable topics. To improve this model's performance, we design a regularization framework that exploits prior spatial information in a social network, as well as a burst-weighted smoothing scheme that exploits temporal prior information in the time dimension. We conduct extensive experiments to evaluate our proposal on two real data sets obtained from Del.icio.us and Twitter. The experimental results verify that our mixture model is able to distinguish temporal topics from stable topics in a single detection process. Our mixture model enhanced with the spatial regularization and the burst-weighted smoothing scheme significantly outperforms competitor approaches in terms of topic detection accuracy and discrimination between stable and temporal topics.
Keywords :
Internet; information retrieval; linguistics; social networking (online); text analysis; Del.icio.us; Twitter; UGC; Web 2.0; burst-weighted smoothing scheme; linguistic features; online social media; spatial regularization; stable topic detection; temporal topic detection; user-generated contents; user-temporal mixture model; Equations; Feature extraction; Hidden Markov models; Mathematical model; Media; Twitter;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering (ICDE), 2013 IEEE 29th International Conference on
Conference_Location :
Brisbane, QLD
ISSN :
1063-6382
Print_ISBN :
978-1-4673-4909-3
Electronic_ISBN :
1063-6382
Type :
conf
DOI :
10.1109/ICDE.2013.6544864
Filename :
6544864