Title :
On Efficient Processing of Subspace Skyline Queries on High Dimensional Data
Author :
Jin, Wen ; Tung, Anthony K H ; Ester, Martin ; Han, Jiawei
Author_Institution :
Simon Fraser Univ, Burnaby
Abstract :
Recent studies on efficiently answering subspace skyline queries fall into two approaches. The first focuses on pre-materializing a set of skyline points in various subspaces, while the second focuses on dynamically answering queries by using a set of anchors to prune off skyline points through spatial reasoning. Despite efforts to compress the pre-materialized subspace skylines through removal of redundancy, the storage space for the first approach remains exponential in the number of dimensions. The query time for the second approach, on the other hand, also grows substantially for data with higher dimensionality, where the pruning power of anchors becomes much weaker. In this paper, we propose methods for answering subspace skyline queries on high dimensional data such that both prematerialization storage and query time can be moderated. We propose novel notions of maximal partial-dominating space, maximal partial-dominated space and maximal equality space between pairs of skyline objects in the full space, and use these concepts as the foundation for answering subspace skyline queries on high dimensional data. Query processing involves mostly simple pruning operations, while skyline computation is done only on a small subset of candidate skyline points in the subspace. We also develop a random sampling method to compute the subspace skyline in an on-line fashion. Extensive experiments have been conducted and demonstrate the efficiency and effectiveness of our methods.
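For readers unfamiliar with the terminology, the following is a minimal sketch of what a subspace skyline query returns, not the method proposed in the paper. It assumes smaller values are preferred on every dimension and uses a naive quadratic scan. The names dominates, subspace_skyline and compare_dims are hypothetical, and the three-way split of dimensions in compare_dims is only one plausible reading of the pairwise "partial-dominating", "partial-dominated" and "equality" spaces named in the abstract.

```python
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point, dims: Sequence[int]) -> bool:
    """True if p dominates q in the subspace `dims`.

    Assumption: smaller is better. p must be no worse than q on every
    selected dimension and strictly better on at least one.
    """
    return all(p[d] <= q[d] for d in dims) and any(p[d] < q[d] for d in dims)


def subspace_skyline(points: Sequence[Point], dims: Sequence[int]) -> List[Point]:
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p, dims) for q in points)]


def compare_dims(p: Point, q: Point) -> Tuple[Set[int], Set[int], Set[int]]:
    """Split the full space into dimensions where p is better than, worse
    than, or equal to q (an illustrative assumption, see the note above)."""
    all_dims = range(len(p))
    better = {d for d in all_dims if p[d] < q[d]}
    worse = {d for d in all_dims if p[d] > q[d]}
    equal = {d for d in all_dims if p[d] == q[d]}
    return better, worse, equal


points = [(1, 4, 3), (2, 2, 2), (3, 1, 3), (3, 3, 1), (4, 4, 4)]
full_space = subspace_skyline(points, dims=(0, 1, 2))
first_two = subspace_skyline(points, dims=(0, 1))
```

In this toy data set the skyline changes with the chosen dimensions: (3, 3, 1) belongs to the full-space skyline but is dominated by (2, 2, 2) once only dimensions 0 and 1 are considered, which is why results computed for one subspace cannot simply be reused for another.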
Keywords :
data mining; inference mechanisms; query processing; relational databases; high dimensional relational database; knowledge discovery; maximal equality space; maximal partial-dominated space; prematerialization storage space; query processing; random sampling method; spatial reasoning; subspace skyline query answering; Airports; Computer science; Information systems; Internet; Query processing; Relational databases; Sampling methods;
Conference_Title :
Scientific and Statistical Database Management, 2007. SSDBM '07. 19th International Conference on
Conference_Location :
Banff, Alta.
Print_ISBN :
0-7695-2868-6
Electronic_ISBN :
1551-6393
DOI :
10.1109/SSDBM.2007.20