DocumentCode :
2840422
Title :
MOVE: A Large Scale Keyword-Based Content Filtering and Dissemination System
Author :
Rao, Weixiong ; Chen, Lei ; Hui, Pan ; Tarkoma, Sasu
Author_Institution :
Dept. of Comput. Sci., Univ. of Helsinki, Helsinki, Finland
fYear :
2012
fDate :
18-21 June 2012
Firstpage :
445
Lastpage :
454
Abstract :
The Web 2.0 era is characterized by the emergence of a very large amount of live content. A real-time, fine-grained content filtering approach can keep users precisely up to date with the information they are interested in. The key to the approach is a scalable matching algorithm. One might treat content matching as a special kind of content search and resort to the classic algorithm [5]. However, due to blind flooding, [5] cannot be simply adapted for scalable content matching. To increase matching throughput, we propose an adaptive approach to allocate (i.e., replicate and partition) filters. The allocation is based on our observation of real datasets: most users prefer short queries of around 2-3 terms each, whereas web content typically contains tens to even thousands of terms per article. Thus, by reducing the number of processed documents, we can reduce the latency of matching large articles against filters and have a chance to achieve higher throughput. We implement our approach on an open source project, Apache Cassandra. Experiments with real datasets show that our approach can achieve around folds of better throughput than two state-of-the-art counterpart solutions.
Keywords :
Internet; information dissemination; information filtering; Apache Cassandra; MOVE; Web 2.0; Web content; dissemination system; large scale keyword-based content filtering; scalable match algorithm; Clustering algorithms; Equations; Indexes; Optimization; Registers; Resource management; Throughput;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Distributed Computing Systems (ICDCS), 2012 IEEE 32nd International Conference on
Conference_Location :
Macau
ISSN :
1063-6927
Print_ISBN :
978-1-4577-0295-2
Type :
conf
DOI :
10.1109/ICDCS.2012.32
Filename :
6258017