Title :
A wavelet framework for adapting data cube views for OLAP
Author :
Smith, John R. ; Li, Chung-Sheng ; Jhingran, Anant
Author_Institution :
IBM Thomas J. Watson Res. Center, Hawthorne, NY, USA
fDate :
5/1/2004
Abstract :
This article presents a method for adaptively representing multidimensional data cubes using wavelet view elements in order to support data analysis and querying involving aggregations more efficiently. The proposed method decomposes the data cubes into an indexed hierarchy of wavelet view elements. The view elements differ from traditional data cube cells in that they correspond to partial and residual aggregations of the data cube. The view elements provide highly granular building blocks for synthesizing the aggregated and range-aggregated views of the data cubes. We propose a strategy for selectively materializing alternative sets of view elements based on the access patterns of the views. We present a fast and optimal algorithm for selecting a non-expansive set of wavelet view elements that minimizes the average processing cost for supporting a population of queries of data cube views. We also present a greedy algorithm for selectively materializing redundant sets of view elements which, for measured increases in storage capacity, further reduce processing costs. Experiments and analytic results show that the wavelet view element framework achieves lower processing and storage costs than previous methods that materialize and store redundant views for online analytical processing (OLAP).
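As a rough illustration of the partial and residual aggregations mentioned in the abstract, the following sketch decomposes a one-dimensional data cube using Haar-like pairwise sums and differences. The abstract does not specify the operators, so the sum/difference pair, the function names (`decompose`, `reconstruct`, `build_hierarchy`), and the sample data are all assumptions made for illustration, not the authors' implementation. The input length is assumed to be a power of two.

```python
def decompose(cells):
    """Split cells into partial (pairwise sum) and residual (pairwise difference) elements."""
    partial = [cells[i] + cells[i + 1] for i in range(0, len(cells), 2)]
    residual = [cells[i] - cells[i + 1] for i in range(0, len(cells), 2)]
    return partial, residual


def reconstruct(partial, residual):
    """Invert decompose: recover the original cells from one partial/residual pair."""
    cells = []
    for p, r in zip(partial, residual):
        cells.append((p + r) // 2)  # p + r is twice the first cell, so division is exact
        cells.append((p - r) // 2)  # p - r is twice the second cell
    return cells


def build_hierarchy(cells):
    """Repeatedly decompose the partial aggregation until one total remains.

    Returns the total aggregate and the residual elements of every level.
    """
    residuals = []
    current = list(cells)
    while len(current) > 1:
        current, residual = decompose(current)
        residuals.append(residual)
    return current[0], residuals


# Hypothetical sample data: eight cells along one dimension.
cells = [4, 2, 5, 1, 3, 3, 6, 0]
partial, residual = decompose(cells)
total, residuals = build_hierarchy(cells)

# The total plus all residuals occupy exactly as many values as the original
# cells, which is one way to read "non-expansive" in this toy setting.
stored = 1 + sum(len(level) for level in residuals)
```

In this toy setting the top of the hierarchy is the full aggregate of the cube, and each level's partial elements already hold the sums over aligned ranges, so such views can be read off without touching the base cells.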
Keywords :
data mining; very large databases; wavelet transforms; OLAP; aggregated views; data analysis; data cube views; granular building blocks; greedy algorithm; indexed hierarchy; multidimensional data cubes; multidimensional data management; online analytical processing; optimal algorithm; range-aggregated views; selective materialization; storage capacity; wavelet framework; wavelet view elements; Aggregates; Cost function; Data analysis; Greedy algorithms; Material storage; Multidimensional systems; Performance analysis; Performance gain; Relational databases; Wavelet analysis;
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2004.1277817