• DocumentCode
    2427938
  • Title

    A Hybrid Join Algorithm on Top of Map Reduce

  • Author

    Hu, Weisong ; Ma, Lili ; Liu, Xiaowei ; Qi, Hongwei ; Zha, Li ; Liao, Huaming ; Zhang, Yuezhou

  • fYear
    2011
  • fDate
    24-26 Oct. 2011
  • Firstpage
    44
  • Lastpage
    50
  • Abstract
    Hadoop has shown great power in processing vast amounts of data in parallel. Hive, the database on Hadoop, enables more experts to process relational data by providing a SQL-like interface. However, Hive does not provide an efficient approach for join, a common but expensive operator in relational databases. Because of the importance of join, this paper proposes a novel hybrid algorithm, HJA, which automatically chooses the better-performing method among several: divide and memory copy merge, Partition Join (PJ), and naïve Hive join. Experiments show that HJA achieves the best performance in most situations.
  • Keywords
    SQL; parallel processing; relational databases; HJA; Hadoop; MapReduce; Partition Join; SQL-like interface; naive Hive join; Semantics; auto-tuning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantics Knowledge and Grid (SKG), 2011 Seventh International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4577-1323-1
  • Type

    conf

  • DOI
    10.1109/SKG.2011.13
  • Filename
    6088090