DocumentCode :
2397558
Title :
Research and application of distributed parallel search hadoop algorithm
Author :
AiLing Duan
Author_Institution :
Sch. of Inf. Sci. & Eng., Henan Univ. of Technol., Zhengzhou, China
fYear :
2012
fDate :
19-20 May 2012
Firstpage :
2462
Lastpage :
2465
Abstract :
Hadoop is an open-source distributed parallel computing platform composed mainly of the MapReduce algorithm and a distributed file system. This paper introduces Hadoop and its related technologies, discusses in detail the idea and basic framework of the MapReduce algorithm, and examines the parallelization method and its feasibility for the massive data involved in Internet search. The paper also puts forward an idea and a strategy for using MapReduce for parallel processing of webpage inverted indexes.
Keywords :
Web services; file organisation; information retrieval; parallel algorithms; public domain software; search problems; Hadoop; Internet search; MapReduce algorithm; Web page inverted index; distributed file system; distributed parallel algorithm; open source computing; parallel processing; Distributed databases; Educational institutions; File systems; Indexes; Internet; Parallel processing; Servers; inverted index; parallel computing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems and Informatics (ICSAI), 2012 International Conference on
Conference_Location :
Yantai
Print_ISBN :
978-1-4673-0198-5
Type :
conf
DOI :
10.1109/ICSAI.2012.6223552
Filename :
6223552