Title :
A Highly Efficient Consolidated Platform for Stream Computing and Hadoop
Author :
Matsuura, Hiroya ; Ganse, Masaru ; Suzumura, Toyotaro
Author_Institution :
Tokyo Inst. of Technol., Tokyo, Japan
Abstract :
Data Stream Processing, or stream computing, is a new computing paradigm for processing massive amounts of streaming data in real time without storing it in secondary storage. In this paper, we propose an integrated execution platform for Data Stream Processing and Hadoop with a dynamic load balancing mechanism, aiming at efficient operation of computer systems and reduced latency for Data Stream Processing. Our implementation is built on top of System S, a distributed data stream processing system developed by IBM Research. Our experimental results show that the load balancing mechanism increased CPU usage from 47.77% to 72.14% compared with the same platform without load balancing. Moreover, the results show that latency for stream processing jobs is kept low even under bursty load by dynamically allocating more compute resources to those jobs.
Keywords :
data handling; distributed processing; real-time systems; resource allocation; CPU usage; Hadoop; computer systems; distributed data stream processing system; dynamic load balancing mechanism; integrated execution platform; latency reduction; real-time data processing; stream computing; Batch production systems; Heuristic algorithms; Load management; Prediction algorithms; Processor scheduling; Real time systems; Time series analysis
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0974-5
DOI :
10.1109/IPDPSW.2012.252