DocumentCode :
1174253
Title :
Using datacube aggregates for approximate querying and deviation detection
Author :
Palpanas, Themis ; Koudas, Nick ; Mendelzon, Alberto
Author_Institution :
IBM Thomas J. Watson Res. Center, Hawthorne, NY, USA
Volume :
17
Issue :
11
fYear :
2005
Firstpage :
1465
Lastpage :
1477
Abstract :
Much research has been devoted to the efficient computation of relational aggregations and, specifically, the efficient execution of the datacube operation. In this paper, we consider the inverse problem, that of deriving (approximately) the original data from the aggregates. We motivate this problem in the context of two specific application areas, approximate query answering and data analysis. We propose a framework based on the notion of information entropy that enables us to estimate the original values in a data set, given only aggregated information about it. We then show how approximate queries on the data from which the aggregates were derived can be performed using our framework. We also describe an alternate use of the proposed framework that enables us to identify values that deviate from the underlying data distribution, suitable for data mining purposes. We present a detailed performance study of the algorithms using both real and synthetic data, highlighting the benefits of our approach as well as the efficiency of the proposed solutions. Finally, we evaluate our techniques with a case study on a real data set, which illustrates the applicability of our approach.
Keywords :
data analysis; data mining; data warehouses; maximum entropy methods; query processing; approximate query answering; data analysis; data distribution; data mining; data warehouse; datacube aggregate; deviation detection; information entropy; inverse problem; Aggregates; Cities and towns; Computer Society; Data analysis; Data mining; Decision making; Information analysis; Information entropy; Inverse problems; Marketing and sales; Index Terms: data warehouse; approximate query answering; datacube; deviation detection
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2005.187
Filename :
1512033