• DocumentCode
    249450
  • Title
    Performance Implications of SSDs in Virtualized Hadoop Clusters
  • Author
    Sungyong Ahn ; Sangkyu Park ; Jae-Ki Hong ; Wooseok Chang
  • Author_Institution
    DS Software R&D Center, Samsung Electron. Co. Ltd., Suwon, South Korea
  • fYear
    2014
  • fDate
    June 27 2014-July 2 2014
  • Firstpage
    586
  • Lastpage
    593
  • Abstract
    BigData involves massive volumes of data for which traditional techniques are not effective. Apache Hadoop is currently one of the most popular software frameworks supporting BigData analysis. As Hadoop clusters grow larger, building them in virtualized environments is drawing great attention. However, optimizing the performance of a Hadoop cluster in a virtualized environment is difficult because of the virtualization overhead. This paper identifies the performance implications of SSDs in virtualized Hadoop clusters and shows that the virtualization overhead can be minimized with SSDs. The study reveals that the main virtualization overhead is an I/O bottleneck caused by a fragmented and randomized I/O workload, which virtualization aggravates. SSDs, however, tolerate this workload better than HDDs. As a result, the virtualization overhead with SSDs is much lower than with HDDs. Moreover, with SSDs, the virtualized Hadoop cluster sustains good performance regardless of the number of VMs.
  • Keywords
    Big Data; cloud computing; data analysis; input-output programs; public domain software; software performance evaluation; storage management; virtualisation; BigData analysis; I-O bottleneck; I-O workload; SSD; cloud computing; performance implications; software framework; solid-state drives; virtualization overhead; virtualized Apache Hadoop clusters; Bandwidth; Benchmark testing; Degradation; Performance evaluation; Servers; Virtual machine monitors; Virtualization; BigData; Cloud computing; Hadoop; SSD; Virtualization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data (BigData Congress), 2014 IEEE International Congress on
  • Conference_Location
    Anchorage, AK
  • Print_ISBN
    978-1-4799-5056-0
  • Type
    conf
  • DOI
    10.1109/BigData.Congress.2014.90
  • Filename
    6906832