Regional Information Center for Science and Technology - Reducing service failures by failure and workload aware load balancing in SaaS clouds

DocumentCode :

3279256

Title :

Reducing service failures by failure and workload aware load balancing in SaaS clouds

Author :

Roy, Anirban ; Ganesan, Rajeshwari ; Dash, Denver ; Sarkar, Santonu

Author_Institution :

Infosys Labs, Electronics City, Bangalore, India

fYear :

2013

fDate :

24-27 June 2013

Firstpage :

Lastpage :

Abstract :

SLA violations are typically viewed as service failures. If a service fails once, it will fail again unless remedial action is taken. In a virtualized environment, a common remedial action is to restart or reboot a virtual machine (VM). In this paper we present a VM live-migration policy that is aware of SLA threshold violations of workload response time and of physical machine (PM) and VM utilization, as well as availability violations at the PM and VM. The migration policy takes into account PM failures and VM (software) failures, as well as workload features such as burstiness (coefficient of variation, or CoV, > 1), which calls for caution when selecting the target PM for migrating these workloads. The proposed policy also considers migrating a VM when the utilization of the physical machine hosting it approaches its utilization threshold. We propose an algorithm that detects proactive triggers for remedial action, selects a VM for migration, and suggests a possible target PM. We show the efficacy of our approach via discrete event simulation of a relevant case study, plotting the decrease in the number of SLA violations relative to existing approaches that do not trigger migration in response to non-availability-related SLA violations.
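
The abstract names three ingredients: a burstiness test (CoV > 1), proactive triggers based on response-time and utilization thresholds, and cautious target-PM selection. The following is a minimal illustrative sketch of how these could fit together, not the authors' algorithm. All names (`PM`, `needs_migration`, `pick_target`), the `margin` and `headroom` parameters, and the least-utilized selection rule are assumptions made for illustration; the record itself gives no such details.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Optional


def coefficient_of_variation(samples: List[float]) -> float:
    """CoV = standard deviation / mean of the samples."""
    m = mean(samples)
    return pstdev(samples) / m if m else 0.0


def is_bursty(interarrival_times: List[float]) -> bool:
    """The abstract treats a workload with CoV > 1 as bursty."""
    return coefficient_of_variation(interarrival_times) > 1.0


@dataclass
class PM:
    """Hypothetical physical-machine record (not from the paper)."""
    name: str
    utilization: float      # fraction of capacity in use, 0..1
    available: bool = True  # False if the PM is currently failed


def needs_migration(response_time: float, rt_threshold: float,
                    pm_util: float, util_threshold: float,
                    margin: float = 0.05) -> bool:
    """Trigger on a response-time SLA violation, or proactively when PM
    utilization comes within `margin` (assumed value) of its threshold."""
    return response_time > rt_threshold or pm_util >= util_threshold - margin


def pick_target(pms: List[PM], vm_load: float, util_threshold: float,
                bursty: bool, headroom: float = 0.2) -> Optional[PM]:
    """Pick the least-utilized available PM that can absorb the VM.
    For bursty workloads, reserve extra `headroom` (assumed value)."""
    limit = util_threshold - (headroom if bursty else 0.0)
    candidates = [p for p in pms
                  if p.available and p.utilization + vm_load <= limit]
    return min(candidates, key=lambda p: p.utilization, default=None)
```

In this sketch a bursty workload is simply placed more conservatively: the same PM that is acceptable for a steady workload may be rejected for a bursty one because of the reserved headroom.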

Keywords :

cloud computing; discrete event simulation; resource allocation; software fault tolerance; virtual machines; virtualisation; PM failures; PM utilization; SLA threshold violations; SaaS clouds; VM failures; VM live migration policy; availability violations; discrete event simulation; failure aware load balancing; physical machine utilization; proactive trigger detection; remedial action; service failure reduction; software failures; virtual machine; virtualized environment; workload aware load balancing; workload response time; Availability; Degradation; Preventive maintenance; Random variables; Software as a service; Time factors; VM migration; cloud data center; coefficient of variation; discrete event simulation; failure model; software aging;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Dependable Systems and Networks Workshop (DSN-W), 2013 43rd Annual IEEE/IFIP Conference on

Conference_Location :

Budapest

ISSN :

2325-6648

Type :

conf

DOI :

10.1109/DSNW.2013.6615511

Filename :

6615511

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3279256