DocumentCode :
3143928
Title :
Top-k keyword search over probabilistic XML data
Author :
Li, Jianxin ; Liu, Chengfei ; Zhou, Rui ; Wang, Wei
Author_Institution :
Swinburne Univ. of Technol., Melbourne, VIC, Australia
fYear :
2011
fDate :
11-16 April 2011
Firstpage :
673
Lastpage :
684
Abstract :
Despite the proliferation of work on XML keyword query, supporting keyword query over probabilistic XML data remains an open problem. Compared with traditional keyword search, answering a keyword query over probabilistic XML data is far more expensive because possible world semantics must be considered. In this paper, we first define the new problem of top-k keyword search over probabilistic XML data, which is to retrieve the k SLCA results with the k highest probabilities of existence. We then propose two efficient algorithms. The first algorithm, PrStack, finds the k SLCA results with the k highest probabilities by scanning the relevant keyword nodes only once. To further improve efficiency, we propose a second algorithm, EagerTopK, based on a set of pruning properties that quickly prune unsatisfied SLCA candidates. Finally, we implement the two algorithms and compare their performance through an analysis of extensive experimental results.
Keywords :
XML; probability; query processing; EagerTopK algorithm; PrStack algorithm; XML keyword query; k SLCA results; probabilistic XML data; top-k keyword search; Encoding; Equations; Keyword search; Mathematical model; Probabilistic logic; Semantics
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering (ICDE), 2011 IEEE 27th International Conference on
Conference_Location :
Hannover
ISSN :
1063-6382
Print_ISBN :
978-1-4244-8959-6
Electronic_ISBN :
1063-6382
Type :
conf
DOI :
10.1109/ICDE.2011.5767875
Filename :
5767875