DocumentCode :
19067
Title :
Aggregate Estimation in Hidden Databases with Checkbox Interfaces
Author :
Hui Yan ; Zhiguo Gong ; Nan Zhang ; Tao Huang ; Hua Zhong ; Jun Wei
Author_Institution :
Sch. of Comput. Sci. & Technol., Univ. of Sci. & Technol. of China, Hefei, China
Volume :
27
Issue :
5
fYear :
2015
fDate :
May 1 2015
Firstpage :
1192
Lastpage :
1204
Abstract :
A large number of web data repositories are hidden behind restrictive web interfaces, making it an important challenge to enable data analytics over these hidden web databases. Most existing techniques assume a form-like web interface which consists solely of categorical attributes (or numeric ones that can be discretized). Nonetheless, many real-world web interfaces (of hidden databases) also feature checkbox interfaces, e.g., the specification of a set of desired features, such as A/C, navigation, etc., for a car-search website like Yahoo! Autos. We find that, for the purpose of data analytics, such checkbox-represented attributes differ fundamentally from the categorical/numerical ones that were traditionally studied. In this paper, we address the problem of data analytics over hidden databases with checkbox interfaces. Extensive experiments on both synthetic and real datasets demonstrate the accuracy and efficiency of our proposed algorithms.
Keywords :
Internet; data analysis; database management systems; user interfaces; Web data repositories; car-search Web site; checkbox-represented attributes; data analytics; feature checkbox interfaces; form-like Web interface; hidden Web databases; Accuracy; Aggregates; Databases; Educational institutions; Estimation; Testing; aggregate estimation; checkbox; hidden databases; weight adjustment
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2014.2365800
Filename :
6940257