Towards Elasticity in Distributed File Systems

Author

Seguin, Cyril ; Le Mahec, Gael ; Depardon, Benjamin

Author_Institution

MIS Lab., UPJV, Villeurbanne, France

fYear

2015

fDate

4-7 May 2015

Firstpage

1047

Lastpage

1056

Abstract

In IaaS Cloud Computing platforms, elasticity lets users adjust the number of resources to the current workload, taking into account peak (high activity) and trough (low activity) periods by powering some resources down or up. This elasticity principally consists in dynamically starting/stopping Virtual Machines to increase/reduce the computing capacity. In this paper, we study the problem of storage resource elasticity using a Distributed File System (DFS). Indeed, the main DFSes in use today, such as Gluster, HDFS or Lustre, focus on data access performance and transient fault tolerance, but they do not take into account the intentional and dynamic removal/addition of resources. We propose an algorithm that provides an initial data placement and adapts the resources of a DFS to the workload of a platform while maintaining good data access performance. This algorithm takes into account an estimated popularity of the data stored on the DFS to distribute the workload using the fewest resources. Our simulations on top of SimGrid show that we could improve HDFS's performance by up to 41% while using fewer resources (up to 63%). Then, we introduce our solutions that adapt the replication factor of each piece of data to its popularity, allowing more parallel accesses when needed and easing the removal of resources when some data have few concurrent accesses. The simulations show we could either reduce the number of used resources (up to 35%) or increase performance (up to 54%).

Keywords

cloud computing; distributed databases; fault tolerant computing; resource allocation; virtual machines; Gluster; HDFS; IaaS cloud computing platforms; Lustre; SimGrid; computing capacities; data access performance; data placement; distributed file systems; dynamic resource removal-addition; intentional resource removal-addition; parallel accesses; replication factor; transient fault tolerance; Bandwidth; Cloud computing; Distributed databases; Elasticity; Fault tolerance; Rabbits; Servers; Data Popularity; Distributed File System; Read Performances

fLanguage

English

Publisher

IEEE

Conference_Titel

Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on

Conference_Location

Shenzhen

Type

conf

DOI

10.1109/CCGrid.2015.140

Filename

7152591