DocumentCode
1682613
Title
A Switch Criterion for Hybrid Datasets Merging on Top of Map Reduce
Author
Ma, Lili ; Liao, Huaming ; He, Yongqiang ; Li, Feng ; Gao, Qiang
Author_Institution
Key Lab. of Net Sci., Chinese Acad. of Sci., Beijing, China
fYear
2009
Firstpage
293
Lastpage
298
Abstract
Because of MapReduce's restricted structure, the multi-dataset merging problem, which is common in many data mining applications, cannot be solved efficiently with MapReduce. This paper proposes HDMA, a novel hybrid dataset merging algorithm on top of MapReduce. HDMA automatically selects the better of two methods, DMCM and DPM, which are effective under different conditions. HDMA retains the advantages of both methods and makes good use of the memory of data nodes. Experiments show that HDMA achieves the best performance in most situations.
Keywords
data mining; data structures; merging; HDMA; MapReduce restricted structure; data node; hybrid dataset; multidataset merging problem; switch criterion; Application software; Computers; Costs; Data mining; Grid computing; Laboratories; Merging; Partitioning algorithms; Switches; Uniform resource locators; DMCM; DPM; Hash table; MapReduce; auto-tuning; memory
fLanguage
English
Publisher
IEEE
Conference_Titel
Grid and Cooperative Computing, 2009. GCC '09. Eighth International Conference on
Conference_Location
Lanzhou, Gansu
Print_ISBN
978-0-7695-3766-5
Type
conf
DOI
10.1109/GCC.2009.28
Filename
5279579