Title :
An Empirical Study on Quality Issues of Production Big Data Platform
Author :
Hucheng Zhou ; Jian-Guang Lou ; Hongyu Zhang ; Haibo Lin ; Haoxiang Lin ; Tingting Qin
Author_Institution :
Microsoft Res., Beijing, China
Abstract :
Big Data computing platforms have evolved into multi-tenant services. Service quality matters because system failures or performance slowdowns can adversely affect business and user experience. There are few studies in the literature on service quality issues of production Big Data computing platforms. In this paper, we present an empirical study on the service quality issues of Microsoft ProductA, a company-wide multi-tenant Big Data computing platform serving thousands of customers from hundreds of teams. ProductA has a well-defined incident management process, which helps customers report and mitigate service quality issues on a 24/7 basis. This paper explores the common symptoms, causes, and mitigations of service quality issues in Big Data computing. We conduct an empirical study on 210 real service quality issues in ProductA. Our major findings include (1) 21.0% of escalations are caused by hardware faults; (2) 36.2% are caused by system side defects; (3) 37.2% are due to customer side faults. We also study the general diagnosis process and the commonly adopted mitigation solutions. Our findings can help improve current development and maintenance practices for Big Data computing platforms and motivate tool support.
Keywords :
Big Data; software quality; Microsoft ProductA; customer side faults; hardware faults; incident management process; multitenant Big Data computing platform; production Big Data computing platform; service quality; system side defects; Big data; Business; Electronic mail; Hardware; Iron; Programming; Software engineering
Conference_Title :
Software Engineering (ICSE), 2015 IEEE/ACM 37th IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICSE.2015.130