Title :
Design and implementation of parallel statistical algorithm based on Hadoop's MapReduce model
Author :
Duan, Songqing ; Wu, Bin ; Wang, Bai ; Yang, Juan
Author_Institution :
Beijing Key Lab. of Intell. Telecommun. Software & Multimedia, Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
The rapid growth of data promotes the development of parallel computing. MapReduce, a simplified programming model for distributed parallel computing, is becoming increasingly popular. In this paper, we design and implement a parallel statistical algorithm based on Hadoop's MapReduce model. The algorithm, which is used to grasp the overall characteristics of massive data, calculates central tendency, dispersion and distribution tendency. Our experiments indicate that the algorithm is suitable for dealing with large-scale data.
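As a rough illustration of how central tendency and dispersion can be computed in a map/reduce style, the sketch below summarizes each data split independently and then merges the partial summaries. It is not the authors' implementation and does not use the Hadoop API; the function names (map_partition, combine, summarize) and the choice of a pairwise mean/variance merge are assumptions made for this example only.

```python
from functools import reduce
from math import sqrt


def map_partition(values):
    """Map step (hypothetical): summarize one split as (count, mean, M2).

    M2 is the sum of squared deviations from the running mean,
    updated one value at a time (Welford's method).
    """
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return (n, mean, m2)


def combine(a, b):
    """Reduce step (hypothetical): merge two partial summaries."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return (0, 0.0, 0.0)
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return (n, mean, m2)


def summarize(partitions):
    """Run the map step on every split, then fold the results together."""
    n, mean, m2 = reduce(combine, map(map_partition, partitions), (0, 0.0, 0.0))
    variance = m2 / n if n else float("nan")  # population variance
    return {"count": n, "mean": mean, "variance": variance, "std": sqrt(variance)}


if __name__ == "__main__":
    # Three splits standing in for blocks of a distributed file.
    print(summarize([[1, 2, 3], [4, 5], [6]]))
```

Because combine is associative, the partial summaries can be merged in any grouping, which is the property a combiner or reducer would rely on. Distribution-tendency measures such as skewness would need additional higher-order terms in each summary, which this sketch omits.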
Keywords :
parallel algorithms; Hadoop MapReduce model; central tendency calculation; dispersion calculation; distributed parallel computing; distribution tendency calculation; parallel computing; parallel statistical algorithm; Algorithm design and analysis; Cloud computing; Computational modeling; Dispersion; File systems; Gaussian distribution; Programming; Central Tendency; Dispersion; Distribution Tendency; Hadoop; MapReduce; Parallel Statistical Algorithm;
Conference_Title :
Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-61284-203-5
DOI :
10.1109/CCIS.2011.6045047