DocumentCode :
611058
Title :
Supporting a Light-Weight Data Management Layer over HDF5
Author :
Yi Wang; Yu Su; Gagan Agrawal
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear :
2013
fDate :
13-16 May 2013
Firstpage :
335
Lastpage :
342
Abstract :
Scientific simulations are now being performed at finer temporal and spatial scales, leading to an explosion of output data and to challenges in storing, managing, disseminating, analyzing, and visualizing these datasets. Tools commonly used today for disseminating and visualizing such data have inherent limitations, making it extremely hard to deal with larger datasets. We have developed a light-weight data management tool that allows server-side subsetting and aggregation on scientific datasets stored in HDF5, one of the most popular scientific data formats. To support a variety of queries efficiently, our tool generates code for hyperslab selectors and content-based filtering, and parallelizes selection and aggregation queries using novel algorithms. Our tool also supports certain recent HDF5 features, including dimension scales and compound data types. Through extensive evaluation, we show that our system is capable of efficiently supporting a variety of queries, scaling performance by parallelizing the queries, and reducing wide-area data transfers through server-side data aggregation. We demonstrate that even for subsetting queries that are directly supported in OPeNDAP, a tool widely used by data dissemination portals, the sequential performance of our system is better.
Keywords :
content management; database management systems; information filtering; query processing; scientific information systems; HDF5; OPeNDAP tool; aggregation query; compound data type; content-based filtering; data dissemination portal; data transfer; dimension scale; hyperslab selector; light-weight data management layer; light-weight data management tool; query subsetting; scientific data format; scientific dataset; scientific simulation; server-side data aggregation; server-side subsetting; Compounds; Data models; Data transfer; Data visualization; Indexes; Layout; Libraries
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on
Conference_Location :
Delft
Print_ISBN :
978-1-4673-6465-2
Type :
conf
DOI :
10.1109/CCGrid.2013.9
Filename :
6546110