Title :
Kahuna: Problem diagnosis for MapReduce-based cloud computing environments
Author :
Tan, Jiaqi ; Pan, Xinghao ; Marinelli, Eugene ; Kavulya, Soila ; Gandhi, Rajeev ; Narasimhan, Priya
Author_Institution :
DSO Nat. Labs., Singapore, Singapore
Abstract :
We present Kahuna, an approach that aims to diagnose performance problems in MapReduce systems. Central to Kahuna's approach is our insight on peer-similarity, that nodes behave alike in the absence of performance problems, and that a node that behaves differently is the likely culprit of a performance problem. We present applications of Kahuna's insight in techniques and their algorithms to statistically compare black-box (OS-level performance metrics) and white-box (Hadoop-log statistics) data across the different nodes of a MapReduce cluster, in order to identify the faulty node(s). We also present empirical evidence of our peer-similarity observations from the 4000-processor Yahoo! M45 Hadoop cluster. In addition, we demonstrate Kahuna's effectiveness through experimental evaluation of two algorithms for a number of reported performance problems, on four different workloads in a 100-node Hadoop cluster running on Amazon's EC2 infrastructure.
Keywords :
Internet; distributed processing; Hadoop-log statistics; Kahuna; MapReduce-based cloud computing; OS-level performance metrics; Yahoo! M45 Hadoop cluster; peer-similarity; problem diagnosis; Cloud computing; Clustering algorithms; Data mining; Facebook; Fault diagnosis; Large-scale systems; Measurement; Open source software; Peer to peer computing; Statistics
Conference_Title :
Network Operations and Management Symposium (NOMS), 2010 IEEE
Conference_Location :
Osaka
Print_ISBN :
978-1-4244-5366-5
Electronic_ISBN :
1542-1201
DOI :
10.1109/NOMS.2010.5488446