DocumentCode
2717730
Title
An Efficient Clustering Algorithm for Microblogging Hot Topic Detection
Author
Tu, Hao ; Ding, Jin
Author_Institution
Network & Comput. Center, Huazhong Univ. of Sci. & Tech., Wuhan, China
fYear
2012
fDate
11-13 Aug. 2012
Firstpage
738
Lastpage
741
Abstract
Microblogging has become exceedingly popular, with hundreds of millions of tweets posted every minute on a variety of topics. Most hot events are retweeted thousands of times within a short time, which helps us trace them. This paper focuses on tracing those events by mining the text stream of a microblog. Although event detection has long been a research topic, the characteristics of microblogs bring new challenges. Tweets reporting such events are usually overwhelmed by a flood of meaningless tweets, and the algorithm needs to be scalable given the sheer volume of tweets. First, we use Bayes classification to filter out the meaningless tweets; then we detect hot events from the remaining tweets with a mean-calculation-based incomplete clustering. The experiments show that the algorithm can detect hot events in real time from a large volume of tweets while maintaining good accuracy.
Keywords
Bayes methods; data mining; pattern clustering; social networking (online); text analysis; Bayes classification; Tweets; efficient clustering algorithm; event detection; mean calculation based incomplete clustering; microblogging hot topic detection; text stream mining; Accuracy; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Event detection; Filtering algorithms; Twitter; clustering algorithm; microblog; topic detection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science & Service System (CSSS), 2012 International Conference on
Conference_Location
Nanjing
Print_ISBN
978-1-4673-0721-5
Type
conf
DOI
10.1109/CSSS.2012.189
Filename
6394427