DocumentCode :
3772335
Title :
An Elastic Data Persisting Solution with High Performance for Spark
Author :
Zhipeng Jiang;Haopeng Chen;Huan Zhou;Jenny Wu
Author_Institution :
Sch. of Software, Shanghai Jiao Tong Univ., Shanghai, China
fYear :
2015
Firstpage :
656
Lastpage :
661
Abstract :
With the increasing popularity of in-memory computing, Spark [1] has been highly successful in implementing large-scale, data-intensive applications, especially those that reuse data across multiple parallel operations. However, because Moore's Law has slowed down and memory resources are still costly, we present an elastic data persisting solution for Spark, which enables data compression to save heap space for the JVM and reduces disk I/O for faster data access. We mathematically derive the criteria for selecting the optimal data compression and persisting plan. Our evaluation of a preliminary prototype of this solution shows that it can provide resource management recommendations by accounting for input data type, memory space, and CPU resources, and that it consistently yields high performance, accelerating Spark by up to 6x.
Keywords :
"Sparks","Data compression","Compression algorithms","Benchmark testing","Memory management","Java","Algorithm design and analysis"
Publisher :
IEEE
Conference_Title :
Smart City/SocialCom/SustainCom (SmartCity), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/SmartCity.2015.144
Filename :
7463798