Title :
Fault Detection in Distributed Systems by Representative Subspace Mapping
Author :
Chen, Haifeng ; Jiang, Guofei ; Yoshihira, Kenji
Author_Institution :
NEC Labs. America Inc., Princeton, NJ
Abstract :
The high dimensionality of system observations, together with frequent changes in normal system behavior caused by workload variations, makes fault detection very difficult in distributed computing systems. This paper addresses these issues by proposing a novel statistical technique, principal canonical correlation analysis (PCCA), and applying it to monitor the system in a supervised manner. Given a set of input variables u and system measurements x, PCCA extracts a subspace x̃ from x that is not only highly correlated with the input u but also significantly representative of the whole distribution of x. This property of PCCA, which combines the strengths of both PCA and CCA, is beneficial to the fault detection task. Experimental results from a real e-commerce system based on the multi-tiered J2EE architecture demonstrate the effectiveness of PCCA.
Keywords :
Internet; correlation methods; fault diagnosis; principal component analysis; system monitoring; distributed computing systems; fault detection; principal canonical correlation analysis; representative subspace mapping; statistical technique; system behavior; Distributed computing; Fault detection; Input variables; Laboratories; Large-scale systems; Monitoring; National electric code; Performance analysis; Principal component analysis; Statistics
Conference_Title :
18th International Conference on Pattern Recognition (ICPR 2006)
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2521-0
DOI :
10.1109/ICPR.2006.552