DocumentCode :
180831
Title :
The Communication Complexity of Distributed epsilon-Approximations
Author :
Zengfeng Huang ; Ke Yi
fYear :
2014
fDate :
18-21 Oct. 2014
Firstpage :
591
Lastpage :
600
Abstract :
Data summarization is an effective approach to dealing with the "big data" problem. While data summarization problems traditionally have been studied is the streaming model, the focus is starting to shift to distributed models, as distributed/parallel computation seems to be the only viable way to handle today\´s massive data sets. In this paper, we study ε-approximations, a classical data summary that, intuitively speaking, preserves approximately the density of the underlying data set over a certain range space. We consider the problem of computing ε-approximations for a data set which is held jointly by k players, and give general communication upper and lower bounds that hold for any range space whose discrepancy is known.
Keywords :
Big Data; communication complexity; message passing; Big Data problem; communication complexity; data set density; data summarization problems; data summary; distributed ε-approximations; distributed computation; distributed models; general communication lower bounds; general communication upper bounds; massive data set handling; parallel computation; range space; Approximation methods; Complexity theory; Computational modeling; Data models; Distributed databases; Protocols; Standards; ε-approximations; communication complexity; discrepancy; distributed data;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on
Conference_Location :
Philadelphia, PA
ISSN :
0272-5428
Type :
conf
DOI :
10.1109/FOCS.2014.69
Filename :
6979044
Link To Document :
بازگشت