DocumentCode
130894
Title
Fault tolerant data flow using curator — Storm
Author
Sainik, Lavanya ; Khajuria, Dheeraj
Author_Institution
Centre of Excellence Mediation & Device, Ericsson India Global Services Pvt. Ltd., Gurgaon, India
fYear
2014
fDate
27-29 June 2014
Firstpage
472
Lastpage
475
Abstract
Driven by evolving 3GPP (3rd Generation Partnership Project) standards and the advent of Big Data technology for handling the volume, velocity, and variety of data, industries such as telecommunications, warehousing and storage, finance, and many more need to comply with this evolving technology. There is strong demand to process both real-time and stored data. In this paper we analyze Storm, an open source real-time distributed processing engine, and suggest an improvement to its fault tolerance mechanism so that it can be used reliably for any data processing use case. Vanilla Storm provides guaranteed message processing, but it promises only an “at least once” level of processing. A fully fault-tolerant system requires an “exactly once” level of processing, and to achieve this Storm provides another framework, Trident, which is built on top of it. Trident provides a transactional spout in which transactional metadata <transaction id, data> is stored in ZooKeeper, which provides distributed coordination; data can thus be replayed across nodes and hardware after a failure, timeout, or retry. Trident coordinates this transactional information in ZooKeeper through the Apache Curator framework. However, while the current Trident framework makes a commit per activity (aggregator/reducer) easy to obtain, it has no direct implementation of a transaction commit at the level of a single chain. This paper describes an approach in which, by modifying the existing transactional Trident, a chain-level commit can be obtained using Curator recipes.
Keywords
Big Data; data flow computing; fault tolerant computing; meta data; public domain software; 3GPP; 3rd generation partnership project; Big Data technology; Vanilla storm; apache curator framework; data processing; distributed coordination; fault tolerance mechanism; fault tolerant data flow; guaranteed message processing; hardware data; node data; open source framework; real time distributed processing engine; transaction id; transactional information; transactional metadata information; transactional spout; trident framework; zookeeper; Distributed databases; Fasteners; Fault tolerance; Fault tolerant systems; Radiation detectors; Real-time systems; Storms; Big data; Fault tolerance; PathChildrenCache; Real time data; Storm; apache curator; batch input; transaction management; transactional spout; zookeeper;
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering and Service Science (ICSESS), 2014 5th IEEE International Conference on
Conference_Location
Beijing
ISSN
2327-0586
Print_ISBN
978-1-4799-3278-8
Type
conf
DOI
10.1109/ICSESS.2014.6933608
Filename
6933608