DocumentCode :
141886
Title :
Cinderella — Adaptive online partitioning of irregularly structured data
Author :
Herrmann, Kai ; Voigt, Hannes ; Lehner, Wolfgang
Author_Institution :
Database Technol. Group, Tech. Univ. Dresden, Dresden, Germany
fYear :
2014
fDate :
March 31 2014-April 4 2014
Firstpage :
284
Lastpage :
291
Abstract :
In an increasing number of use cases, databases face the challenge of managing irregularly structured data. Irregularly structured data is characterized by a quickly evolving variety of entities without a common set of attributes. These entities do not show enough regularity to be captured in a traditional database schema. A common solution is to centralize the diverse entities in a universal table. Usually, this leads to a very sparse table. Although today's techniques allow efficient storage of sparse universal tables, query efficiency is still a problem. Queries that reference only a subset of attributes have to read the whole universal table, including many irrelevant entities. One possible solution is to partition the table, which allows partitions of irrelevant entities to be pruned before they are touched. Creating and maintaining such a partitioning manually is very laborious or even infeasible, due to the enormous complexity. Thus, an autonomous solution is desirable. In this paper, we define the Online Partitioning Problem for irregularly structured data and present Cinderella. Cinderella is an autonomous online algorithm for horizontal partitioning of irregularly structured entities in universal tables. It is designed to keep its overhead low by incrementally assigning entities to partitions while they are touched anyway during modifications. The achieved partitioning allows queries that retrieve only entities with a subset of attributes to easily prune partitions of irrelevant entities. Cinderella increases the locality of queries and reduces query execution cost.
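The pruning idea in the abstract can be illustrated with a minimal sketch. This is not the Cinderella algorithm itself: the abstract does not give the assignment rule, so the Jaccard-similarity rule, the `threshold` parameter, and the names `Partition`, `insert`, and `query` are all assumptions made for illustration. Each partition keeps the union of the attributes of its entities, and a query skips every partition whose attribute set cannot contain a matching entity.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    # Union of the attribute names of all entities stored here (hypothetical synopsis).
    attributes: set = field(default_factory=set)
    entities: list = field(default_factory=list)


def jaccard(a, b):
    """Jaccard similarity of two attribute sets (assumed placement metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def insert(partitions, entity, threshold=0.5):
    """Place an entity (a dict of attribute -> value) at insert time.

    Assumed greedy rule, not taken from the paper: pick the most similar
    partition, or open a new one if no partition is similar enough.
    """
    attrs = set(entity)
    best = max(partitions, key=lambda p: jaccard(attrs, p.attributes), default=None)
    if best is None or jaccard(attrs, best.attributes) < threshold:
        best = Partition()
        partitions.append(best)
    best.entities.append(entity)
    best.attributes |= attrs
    return best


def query(partitions, required):
    """Return entities that have all required attributes, plus the number of partitions scanned."""
    required = set(required)
    scanned = 0
    result = []
    for p in partitions:
        if not required <= p.attributes:
            continue  # prune: no entity in this partition can have all required attributes
        scanned += 1
        result.extend(e for e in p.entities if required.issubset(e))
    return result, scanned


if __name__ == "__main__":
    parts = []
    for e in [
        {"name": "a", "isbn": "1"},
        {"name": "b", "isbn": "2", "pages": 10},
        {"gps": "x", "speed": 3},
        {"gps": "y", "speed": 4, "heading": 1},
    ]:
        insert(parts, e)
    rows, scanned = query(parts, ["isbn"])
    print(len(parts), len(rows), scanned)
```

In this toy run the book-like and sensor-like entities end up in separate partitions, so a query on `isbn` scans only one of the two. A realistic design would also need partition size limits and a cheaper synopsis than a full attribute set, which this sketch leaves out.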
Keywords :
data handling; data structures; query processing; Cinderella; adaptive online partitioning problem; autonomous online algorithm; database schema; irregularly structured data management; pruning partitions; query efficiency; query execution cost reduction; sparse universal tables; universal table partitioning; Catalogs; Complexity theory; Database systems; Partitioning algorithms; Prototypes; Time measurement;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering Workshops (ICDEW), 2014 IEEE 30th International Conference on
Conference_Location :
Chicago, IL
Type :
conf
DOI :
10.1109/ICDEW.2014.6818342
Filename :
6818342