Title :
In-memory Query System for Scientific Datasets
Author :
Hsuan-Te Chiu;Jerry Chou;Venkat Vishwanath;Kesheng Wu
Author_Institution :
Nat. Tsing Hua Univ., Hsinchu, Taiwan
Abstract :
The growing gap between compute performance and I/O bandwidth, coupled with increasing data volumes, has made the traditional post-simulation data processing method a bottleneck. Hence, in-situ computing and query-driven data analysis are important techniques for minimizing data movement. By taking advantage of the growing memory capacity on supercomputers, we developed an in-memory query system for scientific data analysis. Our approach combines bitmap indexing, spatial data layout re-organization, distributed shared memory, and location-aware parallel execution. Our evaluations using real scientific datasets showed that we can aggregate the memory capacity of thousands of compute nodes to analyze a 750GB simulation dataset without transferring data to remote nodes or storage systems. Compared with traditional solutions based on out-of-core parallel file systems, we achieve significantly higher query performance.
Keywords :
"Indexing","Data analysis","Computational modeling","Arrays","Data models","Analytical models"
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2015 IEEE 21st International Conference on
Electronic_ISBN :
1521-9097
DOI :
10.1109/ICPADS.2015.53