• DocumentCode
    3365149
  • Title

    Towards systems level prognostics in the Cloud

  • Author

    Deb, Budhaditya ; Shah, Mubarak ; Evans, Steve ; Mehta, Manav ; Gargulak, Anthony ; Lasky, Tom

  • Author_Institution
    GE Global Res., Niskayuna, NY, USA
  • fYear
    2013
  • fDate
    24-27 June 2013
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Many application systems are transforming from device-centric architectures to cloud-based systems that leverage shared compute resources to reduce cost and maximize reach. These systems require new paradigms to assure availability and quality of service. In this paper, we discuss the challenges in assuring availability and quality of service in a cloud-based application system. We propose machine learning techniques for monitoring system logs to assess the health of the system. A web services data set is employed to show that a variety of services can be clustered into different service classes using a k-means clustering scheme. A dataset of Reliability, Availability, and Serviceability (RAS) logs and job logs from a high-performance computing system is employed to show that impending fatal errors in the system can be predicted from the logs using an SVM classifier. These approaches illustrate the feasibility of methods to monitor the system health and performance of compute resources, and hence can be used to manage these systems for high availability and quality of service for critical tasks such as health care monitoring in the cloud.
  • Keywords
    Web services; cloud computing; learning (artificial intelligence); parallel processing; pattern classification; pattern clustering; quality of service; resource allocation; software maintenance; support vector machines; system monitoring; Job logs dataset; RAS logs; SVM classifier; Web services data set; availability assurance; cloud hosted system; cloud-based application system; cost reduction; device centric architectures; error prediction; high performance computing system; k-means clustering scheme; machine learning techniques; quality of service; reliability-availability-and-serviceability logs; resource sharing; service clustering; system health assessment; system health monitoring; system level prognostics; system log monitoring; Availability; Documentation; Kernel; Monitoring; Performance evaluation; Quality of service; Cloud Systems; Systems Prognostics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Prognostics and Health Management (PHM), 2013 IEEE Conference on
  • Conference_Location
    Gaithersburg, MD
  • Print_ISBN
    978-1-4673-5722-7
  • Type

    conf

  • DOI
    10.1109/ICPHM.2013.6621449
  • Filename
    6621449