DocumentCode
74119
Title
Temporal Workload-Aware Replicated Partitioning for Social Networks
Author
Turk, Ata ; Oguz Selvitopi, R. ; Ferhatosmanoglu, Hakan ; Aykanat, Cevdet
Author_Institution
Yahoo Labs., Barcelona, Spain
Volume
26
Issue
11
fYear
2014
fDate
Nov. 2014
Firstpage
2832
Lastpage
2845
Abstract
Most frequent and expensive queries in social networks involve multi-user operations such as requesting the latest tweets or news-feeds of friends. The performance of such queries is heavily dependent on the data partitioning and replication methodologies adopted by the underlying systems. Existing solutions for data distribution in these systems involve hash- or graph-based approaches that ignore the multi-way relations among data. In this work, we propose a novel data partitioning and selective replication method that utilizes the temporal information in prior workloads to predict future query patterns. Our method utilizes the social network structure and the temporality of the interactions among its users to construct a hypergraph that correctly models multi-user operations. It then performs simultaneous partitioning and replication of this hypergraph to reduce the query span while respecting load balance and I/O load constraints under replication. To test our model, we enhance the Cassandra NoSQL system to support selective replication and we implement a social network application (a Twitter clone) utilizing our enhanced Cassandra. We conduct experiments in a cloud computing environment (Amazon EC2) to test the developed systems. Comparison of the proposed method with hash- and enhanced graph-based schemes indicates that it significantly improves latency and throughput.
Keywords
cloud computing; data handling; data structures; graph theory; query processing; resource allocation; social networking (online); Amazon EC2; Cassandra NoSQL system; IO load constraints; cloud computing environment; data distribution; data partitioning; data replication methodologies; expensive queries; graph-based approaches; hash-based approaches; hypergraph; load balance; multiuser operations; multiway relations; news-feeds; query patterns; query span; selective replication method; social network application; temporal workload-aware replicated partitioning; Cloning; Licenses; Load modeling; Measurement; Servers; Twitter; Cassandra; NoSQL; replicated hypergraph partitioning; selective replication; social network partitioning; twitter;
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2014.2302291
Filename
6720183