• DocumentCode
    614035
  • Title
    Performance Analysis and Optimization of Map Only Left Outer Join
  • Author
    Ming Hao ; Wlodarczyk, T.W. ; Chunming Rong
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of Stavanger, Stavanger, Norway
  • fYear
    2013
  • fDate
    25-28 March 2013
  • Firstpage
    625
  • Lastpage
    631
  • Abstract
    We studied the characteristics of HDFS and the Distributed Cache, and how map-side left outer join algorithms have been implemented on the Hadoop platform. To optimize performance, we examined several methods for controlling the number of map tasks. Further, based on the experimental results, we tuned critical parameters. With these changes, we significantly improved performance compared with other existing implementations and experiments.
  • Keywords
    cache storage; distributed databases; parallel processing; Hadoop distributed file system; Hadoop platform; distributed cache; left outer join algorithm; map side; map task amount; performance analysis; performance optimization; Algorithm design and analysis; Computer architecture; Conferences; Data processing; Dictionaries; Exponential distribution; Optimization; HDFS; Hadoop; left outer join; map only;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Information Networking and Applications Workshops (WAINA), 2013 27th International Conference on
  • Conference_Location
    Barcelona
  • Print_ISBN
    978-1-4673-6239-9
  • Electronic_ISBN
    978-0-7695-4952-1
  • Type
    conf
  • DOI
    10.1109/WAINA.2013.74
  • Filename
    6550466