• DocumentCode
    1682613
  • Title

    A Switch Criterion for Hybrid Datasets Merging on Top of Map Reduce

  • Author

    Ma, Lili ; Liao, Huaming ; He, Yongqiang ; Li, Feng ; Gao, Qiang

  • Author_Institution
    Key Lab. of Net Sci., Chinese Acad. of Sci., Beijing, China
  • fYear
    2009
  • Firstpage
    293
  • Lastpage
    298
  • Abstract
    Because of MapReduce's restricted structure, the multi-dataset merging problem, which is common in many data mining applications, cannot be resolved efficiently with MapReduce. This paper proposes HDMA, a novel hybrid dataset merging algorithm on top of MapReduce. HDMA automatically determines the better of two methods, DMCM and DPM, which are effective in different settings. HDMA retains the advantages of both methods and makes good use of the memory of data nodes. Experiments show that HDMA achieves the best performance in most situations.
  • Keywords
    data mining; data structures; merging; HDMA; MapReduce restricted structure; data mining; data node; hybrid dataset; multidataset merging problem; switch criterion; Application software; Computers; Costs; Data mining; Grid computing; Laboratories; Merging; Partitioning algorithms; Switches; Uniform resource locators; DMCM; DPM; Hash table; MapReduce; auto-tuning; memory
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Grid and Cooperative Computing, 2009. GCC '09. Eighth International Conference on
  • Conference_Location
    Lanzhou, Gansu
  • Print_ISBN
    978-0-7695-3766-5
  • Type

    conf

  • DOI
    10.1109/GCC.2009.28
  • Filename
    5279579