DocumentCode :
1175921
Title :
Selection of views to materialize in a data warehouse
Author :
Gupta, Himanshu ; Mumick, Inderpal Singh
Author_Institution :
Dept. of Comput. Sci., State Univ. of New York, Stony Brook, NY, USA
Volume :
17
Issue :
1
fYear :
2005
Firstpage :
24
Lastpage :
43
Abstract :
A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. The goal is to select an appropriate set of views that minimizes total query response time and the cost of maintaining the selected views, given a limited amount of a resource such as materialization time or storage space. In this work, we have developed a theoretical framework for the general problem of selection of views in a data warehouse. We present polynomial-time heuristics for selecting views to optimize total query response time under a disk-space constraint, for some important special cases of the general data warehouse scenario, viz.: 1) an AND view graph, where each query/view has a unique evaluation, e.g., when a multiple-query optimizer can be used to generate a global evaluation plan for the queries, and 2) an OR view graph, in which any view can be computed from any one of its related views, e.g., data cubes. We present proofs showing that the algorithms are guaranteed to provide a solution that is fairly close to (within a constant factor of) the optimal solution. We extend our heuristic to general AND-OR view graphs. Finally, we address in detail the view-selection problem under the maintenance cost constraint and present provably competitive heuristics.
Keywords :
computational complexity; constraint handling; data mining; data warehouses; decision support systems; graph theory; heuristic programming; minimisation; program verification; query processing; storage management; AND view graph; OLAP queries; OR view graph; constant factor ratio; data cubes; data warehouse materialization; decision support systems; disk-space constraint; materialized views; multiple-query optimizer; polynomial-time heuristics; query response time minimization; view-selection problem; Constraint optimization; Costs; Data mining; Data warehouses; Databases; Delay; Information analysis; Material storage; Polynomials; Query processing; Views; data warehouse; materialization; view selection
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2005.16
Filename :
1363763