Title :
Online Failure Forecast for Fault-Tolerant Data Stream Processing
Author :
Gu, Xiaohui ; Papadimitriou, Spiros ; Yu, Philip S. ; Chang, Shu-Ping
Author_Institution :
North Carolina State Univ., Raleigh, NC
Abstract :
In this paper, we present a new online failure forecast system that achieves predictive failure management for fault-tolerant data stream processing. Unlike previous reactive or proactive approaches, predictive failure management uses failure forecasts to take informed, just-in-time preventive actions on abnormal components only. We employ stream-based online learning methods to continuously classify the runtime state of each operator as normal, alert, or failure, based on collected feature streams. We have implemented the online failure forecast system as part of the IBM System S stream processing system. Our experiments show that the online failure forecast system achieves good prediction accuracy for a range of stream processing software failures while imposing low overhead on the stream system.
Keywords :
database management systems; failure analysis; fault tolerance; fault-tolerant data stream processing system; online failure forecast; predictive failure management; stream-based online learning method; Accuracy; Application software; Data analysis; Decision trees; Demand forecasting; Fault tolerance; Fault tolerant systems; Learning systems; Runtime; Sensor systems and applications
Conference_Title :
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-1-4244-1836-7
Electronic_ISBN :
978-1-4244-1837-4
DOI :
10.1109/ICDE.2008.4497565