Title :
Finding Core Topics: Topic Extraction with Clustering on Tweet
Author :
Sungchul Kim ; Sungho Jeon ; Jinha Kim ; Young-Ho Park ; Hwanjo Yu
Author_Institution :
Dept. of Comput. Sci. & Eng., POSTECH, Pohang, South Korea
Abstract :
Twitter is one of the most popular microblogging services; it lets users post short texts called tweets. Tweets differ from conventional text data in that they are typically short and informal, so typical text analysis methods do not work well on them. Accordingly, extracting meaningful topics from tweets poses new challenges. In this work, we propose a simple and novel method called Core-Topic-based Clustering (CTC), which extracts topics and clusters tweets simultaneously based on the clustering principles of minimizing inter-cluster similarity and maximizing intra-cluster similarity. Experimental results show that our method efficiently extracts meaningful topics and that its clustering performance is better than that of the K-means algorithm.
Keywords :
pattern clustering; social networking (online); K-means clustering algorithm; Twitter; clustering principle; core-topic-based clustering; intercluster similarity; intracluster similarity; microblogging service; text analysis method; topic extraction; tweet clustering; Clustering algorithms; Encyclopedias; Internet; Vectors; document clustering; social network
Conference_Title :
Cloud and Green Computing (CGC), 2012 Second International Conference on
Conference_Location :
Xiangtan
Print_ISBN :
978-1-4673-3027-5
DOI :
10.1109/CGC.2012.120