• DocumentCode
    2037300
  • Title

    A fault-tolerant environment for large-scale query processing

  • Author

    Kurt, Mehmet Can ; Agrawal, Gagan

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
  • fYear
    2012
  • fDate
    18-22 Dec. 2012
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    As datasets increase in size, data management and processing needs are being met with added parallelism, i.e., by involving more nodes and/or cores in the system. This, in turn, increases the chances of failures during processing. In this paper, we present the design and implementation of a fault-tolerant environment for processing queries on large scientific datasets. Our system meets the following three requirements that we consider essential for any such environment: 1) high efficiency of execution of a particular data analysis task or query when there are no failures, 2) the ability to handle failure of up to a certain number of nodes, and 3) only a modest slowdown in the processing time of a data analysis task or query when there are failures. We address these challenges by developing a new data replication scheme, which we refer to as subchunk or subpartition replication. Our system currently supports two types of queries: range queries on spatial data and aggregation queries on point datasets, but the underlying ideas can be extended to other query types as well. Our extensive evaluation shows that we can handle single-node and rack failures with only modest slowdowns and, in particular, clearly outperform the traditional (chunk or partition replication) schemes.
  • Keywords
    data analysis; query processing; data management; data processing; data replication scheme; fault-tolerant environment; large scale query processing; spatial aggregation queries; spatial data; subchunk replication; subpartition replication;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing (HiPC), 2012 19th International Conference on
  • Conference_Location
    Pune
  • Print_ISBN
    978-1-4673-2372-7
  • Electronic_ISBN
    978-1-4673-2370-3
  • Type

    conf

  • DOI
    10.1109/HiPC.2012.6507487
  • Filename
    6507487