Title :
Snooze: A Scalable and Autonomic Virtual Machine Management Framework for Private Clouds
Author :
Feller, Eugen ; Rilling, Louis ; Morin, Christine
Author_Institution :
INRIA Centre Rennes, Campus Univ. de Beaulieu, Rennes, France
Abstract :
With the advent of cloud computing and the need to satisfy customers' growing resource demands, cloud providers now operate an increasing number of large data centers. To ease the creation of private clouds, several open-source Infrastructure-as-a-Service (IaaS) cloud management frameworks (e.g., OpenNebula, Nimbus, Eucalyptus, OpenStack) have been proposed. However, all of these systems are either highly centralized or have limited fault tolerance support. Consequently, they share common drawbacks: limited scalability due to a single master node, and a Single Point of Failure (SPOF). In this paper, we present the design, implementation, and evaluation of a novel scalable and autonomic (i.e., self-organizing and self-healing) virtual machine (VM) management framework called Snooze. For scalability, the system uses a self-organizing hierarchical architecture and performs distributed VM management. Moreover, fault tolerance is provided at all levels of the hierarchy, allowing the system to self-heal in case of failures. Our evaluation, conducted on 144 physical machines of the Grid'5000 experimental testbed, shows that the fault tolerance features of the framework do not impact application performance. Moreover, distributed VM management incurs negligible cost, and the system remains highly scalable as the amount of resources increases.
Keywords :
cloud computing; computer centres; public domain software; self-adjusting systems; software fault tolerance; virtual machines; Grid'5000; IaaS cloud management frameworks; Snooze; autonomic virtual machine management; data centers; fault tolerance; open-source infrastructure-as-a-service cloud management frameworks; private clouds; scalable virtual machine management; self-healing system; self-organizing hierarchical architecture; single master node; single point-of-failure; Fault tolerant systems; Heart beat; IP networks; Monitoring; Scalability; Self-Healing; Self-Organization; Virtualization;
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium on
Conference_Location :
Ottawa, ON
Print_ISBN :
978-1-4673-1395-7
DOI :
10.1109/CCGrid.2012.71