Title :
MapReduce: Limitations, Optimizations and Open Issues
Author :
Kalavri, Vasiliki ; Vlassov, Vladimir
Author_Institution :
KTH Royal Institute of Technology, Stockholm, Sweden
Abstract :
MapReduce has recently gained great popularity as a programming model for processing and analyzing massive data sets, and it is used extensively in academia and industry. Several implementations of the MapReduce model have emerged, with the Apache Hadoop framework being the most widely adopted. Hadoop offers various utilities, such as a distributed file system, job scheduling and resource management capabilities, and a Java API for writing applications. Hadoop's success has attracted research interest and has led to various modifications and extensions to the framework. Implemented optimizations include performance improvements, programming model extensions, tuning automation and usability enhancements. In this paper, we discuss the current state of the Hadoop framework and its identified limitations. We present, compare and classify Hadoop/MapReduce variations, and we identify trends, open issues and possible future directions.
Keywords :
Big Data; Java; application program interfaces; parallel programming; public domain software; Apache Hadoop framework; Hadoop variation classification; Java API; MapReduce model; MapReduce variation classification; distributed file system; job scheduling capabilities; massive data set analysis; massive data set processing; performance improvements; programming model; resource management capabilities; tuning automation; usability enhancements; Computational modeling; Data models; Fault tolerance; Indexes; Optimization; Programming; Tuning; MapReduce; Survey
Conference_Title :
2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)
Conference_Location :
Melbourne, VIC
DOI :
10.1109/TrustCom.2013.126