DocumentCode :
660855
Title :
A Dynamic Replication Mechanism to Reduce Response-Time of I/O Operations in High Performance Computing Clusters
Author :
Khaneghah, Ehsan Mousavi ; Mirtaheri, Seyedeh Leili ; Grandinetti, Lucio ; Memaripour, Amir Saman ; Sharifi, Morteza
Author_Institution :
Center of High Performance Comput. for Parallel & Distrib. Process., Univ. of Calabria, Rende, Italy
fYear :
2013
fDate :
8-14 Sept. 2013
Firstpage :
738
Lastpage :
743
Abstract :
Extraordinarily large datasets of high performance computing applications require improvement in existing storage and retrieval mechanisms. Moreover, enlargement of the gap between data processing and I/O operations' throughput will bound the system performance to storage and retrieval operations and remarkably reduce the overall performance of high performance computing clusters. File replication is a way to improve the performance of I/O operations and increase network utilization by storing several copies of every file. Furthermore, this will lead to a more reliable and fault-tolerant storage cluster. In order to improve the response time of I/O operations, we have proposed a mechanism that estimates the required number of replicas for each file based on its popularity. Besides that, the remaining space of the storage cluster is considered in the evaluation of replication factors and the number of replicas is adapted to the storage state. We have implemented the proposed mechanism using HDFS and evaluated it using the MapReduce framework. Evaluation results prove its capability to improve the response time of read operations and increase network utilization. Consequently, this mechanism reduces the overall response time of read operations by considering files' popularity in the replication process and adapts the replication factor to the cluster state.
Keywords :
distributed databases; information retrieval systems; input-output programs; parallel processing; HDFS; IO operation response-time; MapReduce framework; cluster state; dynamic replication mechanism; file popularity; high performance computing applications; high performance computing clusters; large datasets; network utilization; read operations; replication process; retrieval mechanisms; storage mechanisms; Bandwidth; High performance computing; History; Reliability; System performance; Throughput; Time factors; Adaptive Storage; Dynamic Replication; File Replication; File Systems;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Social Computing (SocialCom), 2013 International Conference on
Conference_Location :
Alexandria, VA
Type :
conf
DOI :
10.1109/SocialCom.2013.110
Filename :
6693407