• DocumentCode
    3433792
  • Title

    Visual, Log-Based Causal Tracing for Performance Debugging of MapReduce Systems

  • Author

    Tan, Jiaqi ; Kavulya, Soila ; Gandhi, Rajeev ; Narasimhan, Priya

  • Author_Institution
    DSO Nat. Labs., Singapore, Singapore
  • fYear
    2010
  • fDate
    21-25 June 2010
  • Firstpage
    795
  • Lastpage
    806
  • Abstract
    The distributed nature and large scale of MapReduce programs and systems pose two challenges in using existing profiling and debugging tools to understand MapReduce programs. Existing tools produce too much information because of the large scale of MapReduce programs, and they do not expose program behaviors in terms of Maps and Reduces. We have developed a novel non-intrusive log-analysis technique which extracts state-machine views of the control- and data-flows in MapReduce behavior from the native logs of Hadoop MapReduce systems, and it synthesizes these views to create a unified, causal view of MapReduce program behavior. This technique enables us to visualize MapReduce programs in terms of MapReduce-specific behaviors, aiding operators in reasoning about and debugging performance problems in MapReduce systems. We validate our technique and visualizations using a real-world workload, showing how to understand the structure and performance behavior of MapReduce jobs, and diagnose injected performance problems reproduced from real-world problems.
  • Keywords
    data visualisation; finite state machines; program debugging; MapReduce systems; debugging tools; distributed nature; log based causal tracing; nonintrusive log analysis technique; performance debugging; program behaviors; state-machine extraction; Data mining; Debugging; Distributed computing; Instruments; Java; Laboratories; Large-scale systems; Processor scheduling; Radio access networks; Visualization; Cloud Computing; Distributed Systems; Failure Diagnosis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing Systems (ICDCS), 2010 IEEE 30th International Conference on
  • Conference_Location
    Genova
  • ISSN
    1063-6927
  • Print_ISBN
    978-1-4244-7261-1
  • Type

    conf

  • DOI
    10.1109/ICDCS.2010.63
  • Filename
    5541622