DocumentCode :
246990
Title :
Topic Detection in Chinese Microblogs Using Hot Term Discovery and Adaptive Spectral Clustering
Author :
Chengxu Ye ; Ping Yang ; Shaopeng Liu
Author_Institution :
Qinghai Normal Univ., Xining, China
fYear :
2014
fDate :
8-10 Nov. 2014
Firstpage :
110
Lastpage :
119
Abstract :
Weibo is a popular Chinese microblogging service with millions of users that lets them share short text messages. As an information network, Weibo can tell people what they care about as it happens in society. Unfortunately, users constantly struggle to keep up with the growing volume of messages published every day. To help users get the big picture, an efficient and effective topic detection method is urgently needed. Considering the sheer scale and rapid evolution of microblog messages, we investigate a novel method for topic detection in Chinese microblogs within a given time period. It is composed of two major steps. First, hot terms are extracted using a suffix array structure and a TF*SDF term weighting scheme. Second, based on the extracted hot terms, we calculate their co-occurrence information and then group the terms into clusters that represent topics using adaptive spectral clustering. Extensive experimental results on real-world data demonstrate that the proposed method is more effective and efficient for topic detection in Chinese microblogs than existing approaches.
Keywords :
Web sites; electronic messaging; information networks; information retrieval; natural language processing; pattern clustering; Chinese microblogging service; TF*SDF term weighting scheme; Weibo; adaptive spectral clustering; co-occurrence information; hot term discovery; information network; microblog messages; suffix array structure; text message; topic detection; Adaptation models; Arrays; Clustering algorithms; Data mining; Educational institutions; Real-time systems; Time-frequency analysis; adaptive spectral clustering; hot term discovery; microblog; topic detection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC), 2014 Ninth International Conference on
Conference_Location :
Guangdong
Type :
conf
DOI :
10.1109/3PGCIC.2014.44
Filename :
7024566