Title :
SyncSnap: Synchronized Live Memory Snapshots of Virtual Machine Networks
Author :
Bin Shi ; Bo Li ; Lei Cui ; Jieyu Zhao ; Jianxin Li
Author_Institution :
Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing, China
Abstract :
With the prevalence of virtualization technology, virtual machine networks (VMNs) have been widely used to host network applications. To provide fault tolerance and non-stop operation for network applications while preserving network consistency among virtual machines (VMs), distributed snapshot techniques for virtual machine networks have regained the attention of academia. However, existing approaches still suffer from long service interruption and performance degradation when taking snapshots. In particular, the TCP backoff problem, which is caused by inconsistent snapshot completion times among VMs, may lead to network packet loss and even break the connections among virtual machines. In this paper, we present SyncSnap, a system that takes live distributed memory snapshots of virtual machine networks synchronously with only milliseconds of downtime and ensures that all the VMs complete their snapshots at almost the same time. An adaptive single-VM snapshot approach is proposed to accurately control the snapshot duration by dynamically adjusting the snapshot speed. Furthermore, a synchronization mechanism is designed to ensure that a globally consistent state of the VMN can be reached by controlling the snapshot duration of each VM. We have implemented SyncSnap on QEMU/KVM and performed several experiments to evaluate its effectiveness and efficiency. The experimental results demonstrate that our approach can control the VM snapshot duration to a given value with a deviation of only tens of milliseconds and reduce the TCP backoff duration to hundreds of milliseconds.
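The idea of controlling snapshot duration by adjusting snapshot speed can be illustrated with a small simulation. This is only a sketch under stated assumptions, not SyncSnap's implementation: the function names, the page-based model, the fixed control tick, and the `stalled_ticks` parameter (ticks in which no pages get copied, standing in for contention) are all hypothetical. At every tick the controller recomputes the copy rate needed to finish the remaining pages exactly at the target time, so VMs of different sizes given the same target finish at nearly the same moment.

```python
def required_rate(remaining_pages, elapsed_s, target_s):
    """Pages per second needed to finish the remaining pages at target_s."""
    time_left = target_s - elapsed_s
    if time_left <= 0:
        # Deadline reached or passed: copy whatever is left immediately.
        return float("inf")
    return remaining_pages / time_left


def simulate_snapshot(total_pages, target_s, tick_s=0.01, stalled_ticks=frozenset()):
    """Simulate one VM's snapshot and return its duration in seconds.

    The rate is recomputed every tick (hypothetical control interval), so
    time lost during stalled ticks is recovered by copying faster afterwards.
    """
    remaining = float(total_pages)
    tick = 0
    while remaining > 0:
        elapsed = tick * tick_s
        if tick not in stalled_ticks:
            rate = required_rate(remaining, elapsed, target_s)
            remaining -= min(remaining, rate * tick_s)
        tick += 1
    return tick * tick_s


if __name__ == "__main__":
    target = 2.0  # same target duration handed to every VM in the network
    for pages in (1_000, 50_000, 200_000):
        print(pages, simulate_snapshot(pages, target))
    # A VM that is stalled for half a second still converges on the target.
    print(simulate_snapshot(50_000, target, stalled_ticks=frozenset(range(50, 100))))
```

In this toy model the deviation from the target is bounded by roughly one control tick. A real system would also have to account for pages dirtied during the snapshot and for limits on achievable copy speed, which the sketch ignores.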
Keywords :
distributed memory systems; fault tolerant computing; synchronisation; virtual machines; virtualisation; QEMU/KVM; SyncSnap; TCP back off problem; VMN; adaptive single-VM snapshot approach; distributed snapshot technique; fault tolerance; host network applications; live distributed memory snapshots; snapshot duration control; synchronization mechanism; synchronized live memory snapshots; virtual machine networks; virtualization technology; Calibration; Process control; Radiation detectors; Real-time systems; Servers; Synchronization; Virtual machining; Live snapshot; TCP backoff duration; automatically adjust; global consistency; virtual machine networks;
Conference_Title :
2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC, CSS, ICESS)
Print_ISBN :
978-1-4799-6122-1
DOI :
10.1109/HPCC.2014.82