Title :
A cost function for uniformly partitioned UB-trees
Author :
Markl, Volker ; Bayer, R.
Author_Institution :
Bayerisches Forschungszentrum fur Wissensbasierte Syst., Munchen, Germany
Abstract :
Most of the operations of the relational algebra or of SQL (such as projection with duplicate elimination, joins, ordering, group-by and aggregations) are efficiently processed using a sorted stream of tuples. Often, these operations are combined with restrictions in one or several attributes. Previous research has proposed algorithms for efficiently dealing with this kind of query pattern, which is highly relevant with respect to data warehousing, data mining and GIS (geographic information systems). In this paper, we present a cost model that enables a concise estimation of both memory costs and run-time costs for processing queries with restrictions in multiple attributes that may in addition involve a sort operation. Our cost model considers uniformly distributed UB-trees (universal B-trees) with independent dimensions, and it is derived analytically in three steps, starting with a very simple, perfectly idealized partitioning scheme, moving on to imperfect partitioning schemes, and finally evaluating a probabilistically partitioned UB-tree. We also reason to what extent our cost model might be taken into account if the data is not distributed uniformly, by investigating the connection between the cost model and selectivities
Keywords :
SQL; data mining; data warehouses; geographic information systems; query processing; relational algebra; software cost estimation; tree data structures; SQL; attribute restrictions; cost function; cost model; data mining; data warehousing; geographic information systems; idealized partitioning scheme; imperfect partitioning schemes; independent dimensions; memory costs; probabilistically partitioned UB-tree; query patterns; query processing; relational algebra; run-time costs; selectivities; sort operation; sorted tuple stream; uniformly distributed universal B-trees; uniformly partitioned UB-trees; Algebra; Concurrent computing; Cost function; Data mining; Information retrieval; Kernel; Multidimensional systems; Query processing; Runtime; Warehousing;
Conference_Titel :
Database Engineering and Applications Symposium, 2000 International
Conference_Location :
Yokohama
Print_ISBN :
0-7695-0789-1
DOI :
10.1109/IDEAS.2000.880626