• DocumentCode
    1963393
  • Title

    The Data Stream Space Complexity of Cascaded Norms

  • Author

    Jayram, T.S. ; Woodruff, David P.

  • fYear
    2009
  • fDate
    25-27 Oct. 2009
  • Firstpage
    765
  • Lastpage
    774
  • Abstract
    We consider the problem of estimating cascaded aggregates over a matrix presented as a sequence of updates in a data stream. A cascaded aggregate P · Q is defined by evaluating aggregate Q repeatedly over each row of the matrix, and then evaluating aggregate P over the resulting vector of values. This problem was introduced by Cormode and Muthukrishnan, PODS, 2005 [CM]. We analyze the space complexity of estimating cascaded norms on an n × d matrix to within a small relative error. Let Lp denote the p-th norm, where p is a non-negative integer. We abbreviate the cascaded norm Lk · Lp by Lk,p. (1) For any constant k ¿ p ¿ 2, we obtain a 1-pass O¿(n1-2/kd1-2/p)-space algorithm for estimating Lk,p. This is optimal up to polylogarithmic factors and resolves an open question of [CM] regarding the space complexity of L4,2. We also obtain 1-pass space-optimal algorithms for estimating L¿,k and Lk,¿. (2) We prove a space lower bound of ¿(n1-1/k) on estimating Lk,0 and Lk,1, resolving an open question due to Indyk, IITK Data Streams Workshop (Problem 8), 2006. We also resolve two more questions of [CM] concerning Lk,2 estimation and block heavy hitter problems. Ganguly, Bansal and Dube (FAW, 2008) claimed an O(1)-space algorithm for estimating Lk,p for any k,p ¿ [0,2]. Our lower bounds show this claim is incorrect.
  • Keywords
    computational complexity; 1-pass space-optimal algorithm; cascaded aggregate; cascaded norms; data stream space complexity; nonnegative integer; polylogarithmic factor; Aggregates; Area measurement; Computer science; Data mining; Explosions; Frequency; Information technology; Internet; Medical services; Statistics;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Foundations of Computer Science, 2009. FOCS '09. 50th Annual IEEE Symposium on
  • Conference_Location
    Atlanta, GA
  • ISSN
    0272-5428
  • Print_ISBN
    978-1-4244-5116-6
  • Type

    conf

  • DOI
    10.1109/FOCS.2009.82
  • Filename
    5438578