• DocumentCode
    246304
  • Title
    Sequential Fault Monitoring
  • Author
    Feng, Dawei ; Germain, Cecile ; Nauroy, Julien
  • Author_Institution
    LRI, Univ. Paris Sud, Paris, France
  • fYear
    2014
  • fDate
    8-12 Sept. 2014
  • Firstpage
    25
  • Lastpage
    34
  • Abstract
    For large-scale distributed systems, the knowledge component at the core of the MAPE-K loop remains elusive. In the context of end-to-end probing, fault monitoring can be recast as an inference problem in the space-time domain. We propose and evaluate Sequential Matrix Factorization (SMF), a fully spatio-temporal method that exploits both recent advances in matrix factorization for the spatial information and a new heuristic based on historical information. Adaptivity operates at two levels: algorithmically, as the exploration/exploitation trade-off is controlled by a self-calibrating parameter, and at the policy level, as active learning is required for the most challenging cases of a real-world dataset.
  • Keywords
    distributed processing; fault diagnosis; fault tolerant computing; learning (artificial intelligence); matrix decomposition; SMF; active learning; exploration-exploitation tradeoff; fully spatio-temporal method; historical information; self-calibrating parameter; sequential fault monitoring; sequential matrix factorization; spatial information; Context; Estimation; Linear programming; Monitoring; Prediction algorithms; Probes; Yttrium; Fault prediction; matrix factorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud and Autonomic Computing (ICCAC), 2014 International Conference on
  • Conference_Location
    London
  • Type
    conf
  • DOI
    10.1109/ICCAC.2014.17
  • Filename
    7024041