Regional Information Center for Science and Technology - VENU: Orchestrating SSDs in Hadoop storage

DocumentCode :

1791557

Title :

VENU: Orchestrating SSDs in Hadoop storage

Author :

Krish, K.R. ; Iqbal, M. Safdar ; Butt, Ali R.

Year :

2014

Date :

27-30 Oct. 2014

Firstpage :

207

Lastpage :

212

Abstract :

A major obstacle in sustaining high performance and scalability in the Hadoop data processing framework is managing the growing data and the need for very high I/O rates. Solid State Disks (SSDs) are promising and are being employed alongside the slower hard disk drives (HDDs) in emerging storage architectures. However, we observed that SSDs are not always a cost-effective option for all Hadoop workloads, and there is a critical need to identify use cases where SSDs can help. To this end, we present VENU, a dynamic data management system for Hadoop. VENU aims to improve overall I/O throughput via effective use of SSDs as a cache for the slower HDDs, not for all data, but only for the workloads that are expected to benefit from SSDs. In addition, we design placement and retrieval schemes to efficiently use the SSD cache. We evaluate our implementation of VENU on a medium-sized cluster and show that it achieves an 11% improvement in application completion times when 10% of the available storage is provided by SSDs.

Keywords :

cache storage; data handling; parallel processing; Hadoop data processing framework; Hadoop storage; I/O throughput; SSD cache; VENU; dynamic data management system; solid state disks; Bandwidth; Benchmark testing; Distributed databases; Performance evaluation; Prefetching; Throughput; Venus;

Language :

English

Publisher :

IEEE

Conference_Title :

Big Data (Big Data), 2014 IEEE International Conference on

Conference_Location :

Washington, DC

Type :

conf

DOI :

10.1109/BigData.2014.7004234

Filename :

7004234

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1791557