• DocumentCode
    668115
  • Title

    Distance-aware virtual cluster performance optimization: A Hadoop case study

  • Author

    Xinkui Zhao ; Jianwei Yin ; Zuoning Chen ; Xingjian Lu

  • Author_Institution
    Coll. of Comput. Sci., Zhejiang Univ., Hangzhou, China
  • fYear
    2013
  • fDate
    23-27 Sept. 2013
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Cloud computing and big data are becoming two important trends in information technology. However, data-intensive computing struggles to perform well on virtual machines in the cloud because of competition for virtualized resources and complex network communication. The network becomes one of the most notorious bottlenecks, which highlights the need for strategies that lower communication and transmission cost in a virtual cluster. In this paper, we present a novel cluster performance optimization strategy named vClusterOpt. vClusterOpt finds centralized subgraphs of the node graph and chooses the node with the shortest logical distance as the kernel node of each subgraph, reducing inter-machine communication and transmission cost in the virtual cluster. To calculate logical distance accurately, we define two kinds of logical distance: Logical Communication Distance (LCD) and Logical Transmission Distance (LTD). The VM with the shortest LCD to the others is used as the communication kernel node, which bears the most information communication stress, while the VM with the shortest LTD is treated as the transmission kernel node, which bears the most data transmission stress. We use benchmarks running on Hadoop as representative data-intensive computing services to demonstrate the effectiveness of our approach. Experiments show that our distance-aware virtual cluster optimization strategy achieves an average performance improvement of 20%.
  • Keywords
    cloud computing; data handling; graph theory; optimisation; pattern clustering; virtual machines; Hadoop case study; LCD; LTD; VM; big data; centralized subgraphs; cluster performance optimization strategy; complex network communication; data transmission stress; data-intensive computing; distance-aware virtual cluster performance optimization; information technology area; intermachine communication; kernel node; logical communication distance; logical transmission distance; node graph; shortest logical distance; transmission cost; transmission kernel node; vClusterOpt; virtualized resource competition; Clustering algorithms; Kernel; Optimization; Peer-to-peer computing; Servers; Virtual machining; Hadoop; distance-aware virtual cluster; virtual machine communication
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2013 IEEE International Conference on
  • Conference_Location
    Indianapolis, IN
  • Type

    conf

  • DOI
    10.1109/CLUSTER.2013.6702618
  • Filename
    6702618
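
The kernel-node selection described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas for LCD or LTD, so the sketch assumes that pairwise logical distances between the VMs of one subgraph are already available as a table, and it interprets "shortest distance with others" as the smallest sum of distances to all other VMs. The function name `pick_kernel_node`, the VM names, and the distance values are hypothetical and are not taken from the paper.

```python
from typing import Dict

# Hypothetical representation: distances[a][b] is the logical distance
# from VM a to VM b inside one subgraph of the virtual cluster.
Distances = Dict[str, Dict[str, float]]


def pick_kernel_node(distances: Distances) -> str:
    """Return the VM whose summed distance to all other VMs is smallest.

    Assumption: "shortest logical distance with others" is read here as
    the minimum total distance; the paper may define it differently.
    """
    if not distances:
        raise ValueError("distance table is empty")

    def total(node: str) -> float:
        # Sum the distances from this node to every other node.
        return sum(d for other, d in distances[node].items() if other != node)

    # On ties, min() keeps the first node in the table's insertion order.
    return min(distances, key=total)


# Illustrative values only: one table per kind of logical distance.
lcd = {
    "vm1": {"vm2": 1.0, "vm3": 4.0},
    "vm2": {"vm1": 1.0, "vm3": 2.0},
    "vm3": {"vm1": 4.0, "vm2": 2.0},
}
ltd = {
    "vm1": {"vm2": 3.0, "vm3": 1.0},
    "vm2": {"vm1": 3.0, "vm3": 5.0},
    "vm3": {"vm1": 1.0, "vm2": 5.0},
}

communication_kernel = pick_kernel_node(lcd)
transmission_kernel = pick_kernel_node(ltd)
print(communication_kernel, transmission_kernel)
```

With these made-up tables the communication kernel and the transmission kernel are different VMs, which matches the abstract's point that the two roles are chosen by separate distance measures.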