Title :
Space-efficient Relative Error Order Sketch over Data Streams
Author :
Zhang, Ying ; Lin, Xuemin ; Xu, Jian ; Korn, Flip ; Wang, Wei
Author_Institution :
University of New South Wales
Abstract :
We consider the problem of continuously maintaining order sketches over data streams with a relative rank error guarantee ∊. Novel space-efficient and one-scan randomised techniques are developed. Our first randomised algorithm can guarantee such a relative error precision ∊ with confidence 1 - delta using O( 1_ in frac{1} {2}2 log 1d log ∊^2N) space, where N is the number of data elements seen so far in a data stream. Then, a new one-scan space compression technique is developed. Combined with the first randomised algorithm, the one-scan space compression technique yields another one-scan randomised algorithm that guarantees the space requirement is O( 1frac{1} { in } log(1frac{1}{ in } log 1begin{gathered} frac{1}{delta } hfill \\ hfill \\ end{gathered} )frac{{log ^{2 + alpha } in N}} {{1 - 1/2^alpha }} (foralpha gt 0) on average while the worst case space remains O( frac{1}{{ in ^2 }}log frac{1} {delta }log in ^2 N). These results are immediately applicable to approximately computing quantiles over data streams with a relative error guarantee in and significantly improve the previous best space bound O( frac{1} {{ in ^3 }}log frac{1}{delta }log N). Our extensive experiment results demonstrate that both techniques can support an on-line computation against high speed data streams.
Keywords :
Computer network management; Computer networks; Data analysis; Data mining; Distributed computing; Error analysis; High-speed networks; Statistical analysis; Statistical distributions; Stock markets;
Conference_Titel :
Data Engineering, 2006. ICDE '06. Proceedings of the 22nd International Conference on
Print_ISBN :
0-7695-2570-9
DOI :
10.1109/ICDE.2006.145