DocumentCode :
2731055
Title :
Efficient Top-k Query Evaluation on Probabilistic Data
Author :
Re, Cristina ; Dalvi, N. ; Suciu, Daniel
Author_Institution :
Deptartment of Comput. Sci., Washington Univ., USA
fYear :
2007
fDate :
15-20 April 2007
Firstpage :
886
Lastpage :
895
Abstract :
Modern enterprise applications are forced to deal with unreliable, inconsistent and imprecise information. Probabilistic databases can model such data naturally, but SQL query evaluation on probabilistic databases is difficult: previous approaches have either restricted the SQL queries, or computed approximate probabilities, or did not scale, and it was shown recently that precise query evaluation is theoretically hard. In this paper we describe a novel approach, which computes and ranks efficiently the top-k answers to a SQL query on a probabilistic database. The restriction to top-k answers is natural, since imprecisions in the data often lead to a large number of answers of low quality, and users are interested only in the answers with the highest probabilities. The idea in our algorithm is to run in parallel several Monte-Carlo simulations, one for each candidate answer, and approximate each probability only to the extent needed to compute correctly the top-k answers.
Keywords :
SQL; probability; query processing; SQL queries; SQL query evaluation; Top-k query evaluation; approximate probabilities; enterprise applications; probabilistic data; probabilistic databases; Application software; Cleaning; Computer science; Concurrent computing; Data mining; Databases; Motion pictures; Query processing; Text recognition; Uncertainty;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
Conference_Location :
Istanbul
Print_ISBN :
1-4244-0802-4
Type :
conf
DOI :
10.1109/ICDE.2007.367934
Filename :
4221737
Link To Document :
بازگشت