DocumentCode :
3122974
Title :
Recommending Join Queries via Query Log Analysis
Author :
Yang, Xiaoyan ; Procopiuc, Cecilia M. ; Srivastava, Divesh
Author_Institution :
National University of Singapore, Singapore
fYear :
2009
fDate :
March 29 2009-April 2 2009
Firstpage :
964
Lastpage :
975
Abstract :
Complex ad hoc join queries over enterprise databases are commonly used by business data analysts to understand and analyze a variety of enterprise-wide processes. However, effectively formulating such queries is a challenging task for human users, especially over databases that have large, heterogeneous schemas. In this paper, we propose a novel approach to automatically create join query recommendations based on input-output specifications (i.e., input tables on which selection conditions are imposed, and output tables whose attribute values must be in the result of the query). The recommended join query graph includes (i) "intermediate" tables, and (ii) join conditions that connect the input and output tables via the intermediate tables. Our method is based on analyzing an existing query log over the enterprise database. Borrowing from program slicing techniques, which extract parts of a program that affect the value of a given variable, we first extract "query slices" from each query in the log. Given a user specification, we then re-combine appropriate slices to create a new join query graph, which connects the sets of input and output tables via the intermediate tables. We propose and study several quality measures to enable choosing a good join query graph among the many possibilities. Each measure expresses an intuitive notion that there should be sufficient evidence in the log to support our recommendation of the join query graph. We conduct an extensive study using the log of an actual enterprise database system to demonstrate the viability of our novel approach for recommending join queries.
Keywords :
business data processing; distributed databases; query processing; business data analysts; enterprise databases; heterogeneous schemas; input-output specifications; join query graph; join query recommendations; query log analysis; user specification; Data analysis; Data engineering; Database systems; Humans; Information analysis; Performance analysis; USA Councils;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering, 2009. ICDE '09. IEEE 25th International Conference on
Conference_Location :
Shanghai
ISSN :
1084-4627
Print_ISBN :
978-1-4244-3422-0
Electronic_ISBN :
1084-4627
Type :
conf
DOI :
10.1109/ICDE.2009.122
Filename :
4812469
Link To Document :