DocumentCode
17997
Title
How Hadoop Clusters Break
Author
Rabkin, A.; Katz, Randy H.
Author_Institution
Princeton Univ., Princeton, NJ, USA
Volume
30
Issue
4
fYear
2013
fDate
July-Aug. 2013
Firstpage
88
Lastpage
94
Abstract
This article describes an examination of a sample of several hundred support tickets for the Hadoop ecosystem, a widely used group of big data storage and processing systems; a taxonomy of errors and how they are addressed by supporters; and the misconfigurations that are the dominant cause of failures. Some design "antipatterns" and missing platform features contribute to these problems. Developers can use various methods to build more robust distributed systems, thereby helping users and administrators prevent some of these rough edges.
Keywords
data handling; parallel programming; Hadoop cluster; Hadoop ecosystem; data processing system; data storage system; distributed system; Analytical models; Cluster approximation; Data handling; Data storage systems; Information management; Software development; Software reliability; big data; cloud computing; distributed systems; reliability; system administration
fLanguage
English
Journal_Title
Software, IEEE
Publisher
IEEE
ISSN
0740-7459
Type
jour
DOI
10.1109/MS.2012.73
Filename
6216347