DocumentCode :
588171
Title :
MARISSA: MApReduce Implementation for Streaming Science Applications
Author :
Dede, E. ; Fadika, Z. ; Hartog, J. ; Govindaraju, M. ; Ramakrishnan, Lavanya ; Gunter, Dan ; Canon, Richard
Author_Institution :
SUNY Binghamton, Binghamton, NY, USA
fYear :
2012
fDate :
8-12 Oct. 2012
Firstpage :
1
Lastpage :
8
Abstract :
Since its inception, MapReduce has been steadily gaining ground in scientific disciplines ranging from space exploration to protein folding. However, the model poses a challenge for a wide range of current and legacy scientific applications that need to address their “Big Data” challenges. For example, MapReduce's best-known implementation, Apache Hadoop, offers native support only for Java applications. While Hadoop streaming supports applications written in a variety of languages such as C, C++, Python and FORTRAN, streaming has been shown to be a less efficient MapReduce alternative in terms of performance and effectiveness. Additionally, Hadoop streaming offers fewer options than its native counterpart, and as such offers less flexibility along with a limited array of features for scientific software. The Hadoop File System (HDFS), a central pillar of Apache Hadoop, is not a POSIX-compliant file system. In this paper, we present an alternative framework to Hadoop streaming that addresses the needs of scientific applications: MARISSA (MApReduce Implementation for Streaming Science Applications). We describe MARISSA's design and explain how it expands the set of scientific applications that can benefit from the MapReduce model. We also compare and explain the performance gains of MARISSA over Hadoop streaming.
Keywords :
C++ language; Java; distributed processing; Apache Hadoop; C languages; C++ languages; FORTRAN languages; HDFS; Hadoop file system; Hadoop streaming; Java applications; MARISSA; POSIX compliant file system; big data; mapreduce implementation for streaming science applications; protein folding; space exploration; Arrays; Data models; Fault tolerance; Fault tolerant systems; File systems; Java; Peer to peer computing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
E-Science (e-Science), 2012 IEEE 8th International Conference on
Conference_Location :
Chicago, IL
Print_ISBN :
978-1-4673-4467-8
Type :
conf
DOI :
10.1109/eScience.2012.6404432
Filename :
6404432