DocumentCode
2949067
Title
Applying MapReduce algorithm to performance testing in lexical analysis on HDFS
Author
Joldzic, Ognjen V.
Author_Institution
Fac. of Electr. Eng., Univ. of Banja Luka, Banja Luka, Bosnia-Herzegovina
fYear
2013
fDate
26-28 Nov. 2013
Firstpage
841
Lastpage
844
Abstract
This paper presents an overview of distributed data processing technology and explores the possibilities and advantages of using this technology in lexical analysis of Cyrillic text. A detailed overview of Apache Hadoop, one of the most widely used frameworks for processing large datasets, is presented, along with recommendations for planning and deploying such systems. The paper also analyzes results obtained by running lexical analysis programs on a small Hadoop cluster and the effect of various configuration parameters on the total execution times of the test programs.
Keywords
Big Data; distributed processing; software performance evaluation; text analysis; Apache Hadoop cluster; Cyrillic text lexical analysis program; HDFS; MapReduce algorithm; configuration parameters; distributed data processing technology; large dataset processing; performance testing; test programs; Algorithm design and analysis; Clustering algorithms; Data handling; Data storage systems; File systems; Information management; Testing; HDFS; MapReduce; big data; distributed processing; lexical analysis; performance;
fLanguage
English
Publisher
IEEE
Conference_Titel
Telecommunications Forum (TELFOR), 2013 21st
Conference_Location
Belgrade
Print_ISBN
978-1-4799-1419-7
Type
conf
DOI
10.1109/TELFOR.2013.6716361
Filename
6716361