DocumentCode :
2710272
Title :
Estimating Aggregates over Multiple Sets
Author :
Cohen, Edith ; Kaplan, Haim
Author_Institution :
AT&T Labs.-Res., Florham Park, NJ
fYear :
2008
fDate :
15-19 Dec. 2008
Firstpage :
761
Lastpage :
766
Abstract :
Many datasets, including market basket data, text or hypertext documents, and measurement data collected in different nodes or time periods, are modeled as a collection of sets over a ground set of (weighted) items. We consider the problem of estimating basic aggregates such as the weight or selectivity of a subpopulation of the items. We extend classic summarization techniques based on sampling to this scenario when we have multiple sets and selection predicates based on membership in particular sets.
Keywords :
document handling; hypermedia; hypertext documents; market basket data; measurement data collected; multiple sets; summarization techniques; Aggregates; Computer science; Content based retrieval; Costs; Data mining; Frequency; Sampling methods; Time measurement; USA Councils; Web pages; approximate query processing; sampling; similarity; sketching
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3502-9
Type :
conf
DOI :
10.1109/ICDM.2008.110
Filename :
4781175