Title :
Indexing Evolving Events from Tweet Streams
Author :
Hongyun Cai ; Zi Huang ; Divesh Srivastava ; Qing Zhang
Author_Institution :
Sch. of ITEE, Univ. of Queensland, Brisbane, QLD, Australia
Abstract :
Tweet streams provide a variety of real-life and real-time information on social events that dynamically change over time. Although social event detection has been actively studied, how to efficiently monitor evolving events from continuous tweet streams remains open and challenging. One common approach to event detection from text streams is single-pass incremental clustering. However, this approach does not track the evolution of events, nor does it address the issue of efficient monitoring in the presence of a large number of events. In this paper, we capture the dynamics of events using four event operations (create, absorb, split, and merge), which can be effectively used to monitor evolving events. Moreover, we propose a novel event indexing structure, called Multi-layer Inverted List (MIL), to manage dynamic event databases and accelerate large-scale event search and update. We thoroughly study the problem of nearest neighbour search using MIL based on upper bound pruning, along with incremental index maintenance. Extensive experiments have been conducted on a large-scale real-life tweet dataset. The results demonstrate the promising performance of our event indexing and monitoring methods in terms of both efficiency and effectiveness.
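Illustrative note: the single-pass incremental clustering baseline that the abstract refers to can be sketched as follows. This is a minimal, generic sketch and not the authors' method or the MIL index. The function names (`cosine`, `single_pass_cluster`), the whitespace tokenizer, the bag-of-words representation, and the similarity threshold of 0.3 are all assumptions made for illustration. Each incoming tweet is either absorbed into its most similar existing cluster or used to create a new one; split and merge are not modelled here.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def single_pass_cluster(tweets, threshold=0.3):
    """Process tweets once, in arrival order, and return a cluster label per tweet.

    The threshold value is an assumed, illustrative setting.
    """
    centroids = []  # one summed term-count vector per cluster
    labels = []
    for text in tweets:
        vec = Counter(text.lower().split())  # naive whitespace tokenizer (assumption)
        best, best_sim = -1, 0.0
        # Linear scan over every existing cluster to find the nearest one.
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            centroids[best].update(vec)      # absorb the tweet into the nearest cluster
        else:
            centroids.append(Counter(vec))   # create a new cluster
            best = len(centroids) - 1
        labels.append(best)
    return labels
```

The inner loop compares every tweet against every existing cluster, so the cost per tweet grows with the number of clusters. That kind of exhaustive nearest-neighbour scan is the cost an index with pruning would be expected to reduce, although how MIL does so is not described in this record.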
Keywords :
feature extraction; pattern clustering; search problems; social networking (online); text analysis; MIL; Tweet text stream; absorb operation; create operation; event indexing structure; incremental index maintenance; merge operation; multilayer inverted list; nearest neighbour search; single-pass incremental clustering; social event detection; split operation; upper bound pruning; Event detection; Heuristic algorithms; Indexing; Monitoring; Twitter; Upper bound; Event evolution; Event indexing
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2015.2445773