Title :
Online hot topic detection from web news archive in short terms
Author :
RuiGuo Yu ; ManKun Zhao ; Peng Chang ; MuWen He
Author_Institution :
Sch. of Comput. Sci. & Technol., Tianjin Univ., Tianjin, China
Abstract :
In the age of information explosion, a vast amount of information is available on the Internet, and early warning of breaking events has become a popular research subject. Timeliness is one of the most important factors to consider in this subject. However, traditional topic detection approaches are often ineffective at detecting emerging topics, in which the news stories that make up a breaking topic are concentrated in a short term. A specific approach suited to short-term topic detection is therefore required. In this paper, we use a temporal distance factor to measure the similarity between news stories and topics, and on this basis propose a novel approach that can perform better in short-term topic detection. We also adopt the aging theory to build a life cycle model of events, from which the hotness ranking at any time can be obtained; this lays a foundation for future research on early warning of breaking topics. Experiments indicate that our approach is effective and that the life cycle model of events broadly conforms to reality.
Keywords :
Internet; Web sites; data mining; electronic publishing; Web news archive; aging theory; breaking events; hotness ranking; information explosion; life cycle model; online hot topic detection; short-term topic detection; temporal distance factor; Aging; Algorithm design and analysis; Clustering algorithms; Educational institutions; Reliability; Transforms; Vectors; hot topic detection; short terms
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2014 11th International Conference on
Conference_Location :
Xiamen
Print_ISBN :
978-1-4799-5147-5
DOI :
10.1109/FSKD.2014.6980962