DocumentCode :
3697081
Title :
LiveIndex: A Distributed Online Index System for Temporal Microblog Data
Author :
Haifei Huang;Jianxin Li;Richong Zhang;Weiren Yu;Wuyang Ju
Author_Institution :
State Key Lab. of Software Dev. Environ., Beihang Univ., Beijing, China
fYear :
2015
Firstpage :
884
Lastpage :
887
Abstract :
Billions of microblogs are generated on social media platforms such as Twitter and Weibo. Making new microblogs available to the search engine immediately is a critical and challenging problem. Most existing approaches store all terms' posting lists together when building the index, which leads to low index update performance and high query latency. In addition, time is a key feature of microblogs, and most applications, including event detection, need only the most recent data. In this paper, we design a distributed online index system for temporal microblog data, named LiveIndex, which can significantly reduce the time cost of queries over a specific time range, such as queries in event tracing. First, our index is organized as Time Range Partitions to reduce update cost. Second, in every partition, a hash table is used to map each term's posting list to the corresponding sub-partition. Finally, to further reduce the index cost, we adopt an index chain to merge terms with the same posting list. Experiments on a real dataset demonstrate the effectiveness and efficiency of the proposed method.
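The first two ideas in the abstract (time range partitions, and a per-partition hash table from term to posting list) can be illustrated with a minimal single-machine sketch. This is not the LiveIndex implementation: the class name `TimePartitionedIndex`, the fixed partition width, and the whitespace tokenizer are all assumptions made for illustration, and the sub-partitions, index chain, and distribution across nodes are omitted.

```python
from collections import defaultdict


class TimePartitionedIndex:
    """Toy in-memory inverted index split into fixed-width time partitions.

    Hypothetical illustration only; names and layout are assumptions,
    not taken from the LiveIndex paper.
    """

    def __init__(self, partition_seconds=3600):
        self.partition_seconds = partition_seconds
        # partition id -> hash table mapping term -> list of (timestamp, doc_id)
        self.partitions = defaultdict(lambda: defaultdict(list))

    def _partition_id(self, timestamp):
        return timestamp // self.partition_seconds

    def add(self, doc_id, timestamp, text):
        # An insert touches only the partition that covers its timestamp,
        # so older partitions are never rewritten.
        bucket = self.partitions[self._partition_id(timestamp)]
        for term in set(text.lower().split()):
            bucket[term].append((timestamp, doc_id))

    def search(self, term, start, end):
        """Return doc ids containing `term` with start <= timestamp <= end."""
        term = term.lower()
        hits = []
        # Only partitions overlapping [start, end] are visited.
        for pid in range(self._partition_id(start), self._partition_id(end) + 1):
            bucket = self.partitions.get(pid)
            if bucket is None:
                continue
            for ts, doc_id in bucket.get(term, []):
                if start <= ts <= end:
                    hits.append((ts, doc_id))
        return [doc_id for _, doc_id in sorted(hits)]


index = TimePartitionedIndex(partition_seconds=3600)
index.add(1, 100, "earthquake in Beijing")
index.add(2, 4000, "Earthquake relief update")
index.add(3, 9000, "weekend football")
print(index.search("earthquake", 0, 3599))
print(index.search("earthquake", 0, 7199))
```

In this sketch a query restricted to the first hour reads one partition and skips the rest, which is the intuition behind the reduced cost of time-range queries claimed in the abstract.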
Keywords :
"Indexing","Real-time systems","Buildings","Distributed databases","Search engines","Navigation"
Publisher :
IEEE
Conference_Titel :
2015 IEEE 17th International Conference on High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS)
Type :
conf
DOI :
10.1109/HPCC-CSS-ICESS.2015.70
Filename :
7336276