Title :
Set Predicates in SQL: Enabling Set-Level Comparisons for Dynamically Formed Groups
Author :
Chengkai Li ; Bin He ; Ning Yan ; Muhammad Assad Safiullah
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Texas at Arlington, Arlington, TX, USA
Abstract :
In data warehousing and OLAP applications, scalar-level predicates in SQL are increasingly inadequate for a class of operations that require set-level comparison semantics, i.e., comparing a group of tuples with multiple values. Currently, complex SQL queries composed of scalar-level operations are often written to obtain even very simple set-level semantics. Such queries are not only difficult to write but also challenging for a database engine to optimize, and can therefore be costly to evaluate. This paper proposes to augment SQL with set predicates, which bring out otherwise obscured set-level semantics. We studied two approaches to processing set predicates: an aggregate function-based approach and a bitmap index-based approach. Moreover, we designed a histogram-based probabilistic method for set predicate selectivity estimation, for optimizing queries with multiple predicates. The experiments verified its accuracy and effectiveness in optimizing queries.
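To make the notion of a set-level comparison concrete, the following is a minimal sketch of how such a query is commonly phrased today with scalar-level operations only. It is not the syntax proposed in the paper. The `enrollment` table, its columns, the sample rows, and the helper `students_taking_all` are all hypothetical, and the sketch assumes a non-empty list of required values. The question "which students have taken every course in a given set?" has to be encoded indirectly through a filter, a grouping, and a distinct count.

```python
import sqlite3


def students_taking_all(rows, required):
    """Return students whose set of courses contains every course in `required`.

    Hypothetical helper: emulates a set-level "contains" comparison using only
    scalar-level SQL (IN filter + GROUP BY + HAVING COUNT(DISTINCT ...)).
    """
    required = sorted(set(required))  # duplicates would break the count check
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE enrollment (student TEXT, course TEXT)")
    conn.executemany("INSERT INTO enrollment VALUES (?, ?)", rows)

    placeholders = ", ".join("?" for _ in required)
    query = f"""
        SELECT student
        FROM enrollment
        WHERE course IN ({placeholders})          -- keep only required courses
        GROUP BY student                          -- one group per student
        HAVING COUNT(DISTINCT course) = ?         -- group must cover them all
        ORDER BY student
    """
    result = [r[0] for r in conn.execute(query, (*required, len(required)))]
    conn.close()
    return result


# Hypothetical sample data for illustration only.
rows = [
    ("Ann", "CS101"), ("Ann", "CS102"), ("Ann", "MATH1"),
    ("Bob", "CS101"),
    ("Cy", "CS102"), ("Cy", "CS101"),
]

print(students_taking_all(rows, ["CS101", "CS102"]))
```

The intent (a group's set of values contains a given set) is not visible in the query text; it is implied by the interplay of the filter and the count, which is the kind of obscured set-level semantics the abstract refers to.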
Keywords :
SQL; data mining; data warehouses; query processing; statistical analysis; OLAP applications; Structured Query Languages; aggregate function-based approach; bitmap index-based approach; data warehousing; database engine; histogram-based probabilistic method; online analytical processing; query optimization; scalar-level operations; scalar-level predicates; set predicate selectivity estimation; set predicates; set-level comparison semantics; set-level comparisons; set-level semantics; Aggregates; Indexes; Semantics; Syntactics; Vectors; OLAP; grouping; query processing and optimization
Journal_Title :
IEEE Transactions on Knowledge and Data Engineering
DOI :
10.1109/TKDE.2012.156