Title :
Ripple: Improved Architecture and Programming Model for Bulk Synchronous Parallel Style of Analytics
Author :
Spreitzer, M. ; Steinder, Malgorzata ; Whalley, Ian
Abstract :
We present Ripple, an architecture and a programming model for a broad set of data analytics. Ripple builds on the ideas of iterated MapReduce and adds two innovations. First, it has a richer programming model that incorporates more ideas from the Bulk Synchronous Parallel (BSP) model of computation, among others. By doing so, Ripple creates a flexible, higher-level platform that is easier to work with for both application programmers and platform implementors. Second, Ripple is based on a limited interface for key/value storage, making it portable across many different key/value store implementations. By building on these two ideas, Ripple improves the scope, performance, and openness of the data analytics platform. We evaluate Ripple using three representative, non-trivial data analysis scenarios that require iterative computation. Using these examples, we show how Ripple achieves clear performance advantages over iterated MapReduce.
Keywords :
data analysis; distributed databases; iterative methods; parallel programming; software architecture; BSP model; MapReduce; Ripple; application programmers; architecture; bulk synchronous parallel model; data analytics; distributed database; iterative computation; key/value storage; platform implementors; programming model; Computational modeling; Computer architecture; Data models; Programming; Synchronization; Trademarks; Distributed programming
Conference_Title :
Distributed Computing Systems (ICDCS), 2013 IEEE 33rd International Conference on
Conference_Location :
Philadelphia, PA
DOI :
10.1109/ICDCS.2013.67