• DocumentCode
    2214494
  • Title

    Communication and memory optimal parallel data cube construction

  • Author

    Jin, Ruoming ; Yang, Ge ; Vaidyanathan, Karthik ; Agrawal, Gagan

  • Author_Institution
    Dept. of Comput. & Inf. Sci., Ohio State Univ., Columbus, OH
  • fYear
    2003
  • fDate
    9-9 Oct. 2003
  • Firstpage
    573
  • Lastpage
    580
  • Abstract
    Data cube construction is a commonly used operation in data warehouses. Because of the volume of data that is stored and analyzed in a data warehouse and the amount of computation involved in data cube construction, it is natural to consider parallel machines for this operation. We address a number of algorithmic issues in parallel data cube construction. First, we present an aggregation tree for sequential (and parallel) data cube construction, which has minimally bounded memory requirements. An aggregation tree is parameterized by the ordering of dimensions. We present a parallel algorithm based upon the aggregation tree. We analyze the interprocessor communication volume and construct a closed form expression for it. We prove that the same ordering of the dimensions minimizes both the computational and communication requirements. We also describe a method for partitioning the initial array and prove that it minimizes the communication volume. Experimental results from implementation of our algorithms on a cluster of workstations validate our theoretical results
  • Keywords
    computational complexity; data warehouses; minimisation; parallel algorithms; parallel machines; storage management; tree data structures; workstation clusters; aggregation tree; array partitioning; closed form expression; data warehouses; interprocessor communication volume; minimally bounded memory requirements; parallel algorithm; parallel data cube construction; parallel machines; workstation cluster; Aggregates; Clustering algorithms; Companies; Concurrent computing; Data analysis; Data warehouses; Parallel algorithms; Parallel machines; Partitioning algorithms; Performance analysis;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Parallel Processing, 2003. Proceedings. 2003 International Conference on
  • Conference_Location
    Kaohsiung
  • ISSN
    0190-3918
  • Print_ISBN
    0-7695-2017-0
  • Type

    conf

  • DOI
    10.1109/ICPP.2003.1240625
  • Filename
    1240625