DocumentCode :
3230935
Title :
Cost-Aware Client-Side File Caching for Data-Intensive Applications
Author :
Yaning Huang ; Hai Jin ; Xuanhua Shi ; Song Wu ; Yong Chen
Author_Institution :
Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China
Volume :
2
fYear :
2013
fDate :
2-5 Dec. 2013
Firstpage :
248
Lastpage :
251
Abstract :
Parallel and distributed file systems are widely used to provide high throughput in high-performance computing and Cloud computing systems. To increase parallelism, I/O requests are partitioned into multiple sub-requests (or 'flows') and distributed across different data nodes. File system performance is extremely poor if data nodes have highly unbalanced response times. Client-side caching offers a promising direction for addressing this issue. However, current work has primarily used client-side memory as a read cache and employed a write-through policy, which requires a synchronous update for every write and significantly under-utilizes the client-side cache when applications are write-intensive. Realizing that the cost of an I/O request depends on the struggler sub-requests, we propose a cost-aware client-side file caching (CCFC) strategy that is designed to cache the sub-requests with high I/O cost on the client end. This caching policy enables a new trade-off across the write performance, consistency guarantee, and cache size dimensions. Using the MADbench2 benchmark workload, we evaluate our new cache policy alongside conventional write-through. We find that the proposed CCFC strategy can achieve up to 110% throughput improvement compared to conventional write-through policies with the same cache size on an 85-node cluster.
Keywords :
cache storage; cloud computing; distributed databases; parallel processing; CCFC strategy; I/O cost; I/O requests; MADbench2; cache size dimensions; caching policy; client-side memory; cloud computing systems; consistency guarantee; cost-aware client-side file caching; data nodes; data-intensive applications; distributed file systems; high-performance computing; parallel file systems; parallelism; read cache; struggler subrequests; synchronous update; system performance; write performance; write-intensive; write-through policies; write-through policy; Analytical models; Benchmark testing; Cloud computing; Distributed databases; Servers; Throughput; Time factors; Cloud computing; Parallel file system; client-side file caching; data-intensive computing; high-performance computing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on
Conference_Location :
Bristol
Type :
conf
DOI :
10.1109/CloudCom.2013.140
Filename :
6735429