• DocumentCode
    3717167
  • Title
    TrustMR: Computation integrity assurance system for MapReduce
  • Author
    Huseyin Ulusoy;Murat Kantarcioglu;Erman Pattuk
  • Author_Institution
    The University of Texas at Dallas, 800 W. Campbell Rd, Richardson TX, 75080
  • fYear
    2015
  • Firstpage
    441
  • Lastpage
    450
  • Abstract
    Data and computation integrity are major concerns for users of MapReduce systems. Most production-level MapReduce systems optimistically assume that all nodes are trustworthy. Yet even one compromised node can corrupt the integrity of the final results of a computation. The literature addresses this problem in several ways: some approaches use special-purpose hardware, giving up the ability to run on commodity machines; others inject watermarking patterns, which targets only particular datasets and jobs; and others replicate entire jobs, incurring huge overheads. In this paper, we propose a new replication-based method that can achieve very high attack detection rates (e.g., 99.99%) while incurring only one fifth (20%) of the overhead incurred by competing approaches. The method decomposes a MapReduce computation into smaller pieces (i.e., intermediate result production). Only a subset of these pieces is selectively generated in the replicated tasks, which significantly reduces the network transfer of the replicated tasks. Our empirical results show that a relatively small number of replicated intermediate results can provide a high detection rate while considerably reducing the overhead of replication.
  • Keywords
    "Computers","Watermarking","Hardware","Computational modeling","Big data","Programming"
  • Publisher
    ieee
  • Conference_Titel
    Big Data (Big Data), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/BigData.2015.7363785
  • Filename
    7363785