DocumentCode :
3001828
Title :
MapReduce Framework Optimization via Performance Modeling
Author :
Xu, Lijie
Author_Institution :
Inst. of Software, Beijing, China
fYear :
2012
fDate :
21-25 May 2012
Firstpage :
2506
Lastpage :
2509
Abstract :
The MapReduce framework has become the state-of-the-art paradigm for large-scale data processing. In our ongoing work, we attempt to solve three interrelated problems: how to build an accurate MapReduce performance model, how to use it to automatically detect and optimize slow-running MapReduce jobs, and how to use it to help the scheduler arrange the job execution sequence. Currently, we mainly study the job execution time model and its training method. We also present several policies to optimize the job configuration and the scheduler.
Keywords :
data analysis; distributed processing; public domain software; scheduling; Apache Hadoop; MapReduce framework optimization; MapReduce performance model; job configuration; job execution sequence; job execution time model; job scheduler; large-scale data analysis; large-scale data processing; open-source implementation; performance modeling; training method; Degradation; Encyclopedias; Optimization; Predictive models; Resource management; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0974-5
Type :
conf
DOI :
10.1109/IPDPSW.2012.313
Filename :
6270880