• DocumentCode
    1854376
  • Title
    Adaptive Disk I/O Scheduling for MapReduce in Virtualized Environment
  • Author
    Ibrahim, Shadi ; Jin, Hai ; Lu, Lu ; He, Bingsheng ; Wu, Song
  • Author_Institution
    Cluster & Grid Comput. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China
  • fYear
    2011
  • fDate
    13-16 Sept. 2011
  • Firstpage
    335
  • Lastpage
    344
  • Abstract
    Virtual machine (VM) interference has long been a challenging problem for performance predictability and system throughput for large-scale virtualized environments in the cloud. Such interferences are contributed by intertwined factors including the application's type, the number of concurrent VMs, and the VM scheduling algorithms used within the host. Since MapReduce has become an important data processing platform in the cloud, we investigate the impact of disk schedulers in Hadoop. Interestingly, our experimental results report a noticeable variation of the Hadoop performance between different applications when applying different disk pairs' schedulers in both the hypervisor and the virtual machines. Furthermore, a typical Hadoop application consists of different interleaving stages, each requiring different I/O workloads and patterns. As a result, the disk pairs' schedulers are not only sub-optimal for different MapReduce applications, but also sub-optimal for different sub-phases of the whole job. Accordingly, this paper presents a novel approach for adaptively tuning the disk pairs' schedulers in both the hypervisor and the virtual machines during the execution of a single MapReduce job. Our results show that MapReduce performance can be significantly improved; specifically, adaptive tuning of disk pairs' schedulers achieves a 25% performance improvement on a sort benchmark with Hadoop.
  • Keywords
    cloud computing; processor scheduling; virtual machines; virtualisation; Hadoop application; MapReduce applications; MapReduce job execution; VM scheduling algorithms; adaptive disk I/O scheduling; adaptive disk pair scheduler tuning; adaptive tuning; data processing platform; virtual machine interference; virtualized environment; Benchmark testing; Interference; Switches; Throughput; Tuning; Virtual machine monitors; Virtual machining; Disk I/O Scheduler; Hadoop; MapReduce; Meta-Scheduler; Virtual Machine;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing (ICPP), 2011 International Conference on
  • Conference_Location
    Taipei City
  • ISSN
    0190-3918
  • Print_ISBN
    978-1-4577-1336-1
  • Electronic_ISBN
    0190-3918
  • Type
    conf
  • DOI
    10.1109/ICPP.2011.86
  • Filename
    6047047