Title :
Comparison of Map-Reduce and SQL on Large-Scale Data Processing
Author :
Leu, Jenq-Shiou ; Yee, Yun-Sun ; Chen, Wa-Lin
Author_Institution :
Dept. of Electron. Eng., Nat. Taiwan Univ. of Sci. & Technol., Taipei, Taiwan
Abstract :
The term 'Cloud-Computing' has grown in popularity in recent years. Many large companies, such as Yahoo and Google, have tried to provide related services to the business community and even to public users. In addition to SQL, Map-Reduce, a programming model for large-scale data processing, has become a hot topic that is widely discussed in many studies. Many real-world tasks, such as data processing for search engines, can be implemented in parallel through a simple interface with two functions called Map and Reduce. In this paper, we focus on comparing the performance of the Hadoop implementation of Map-Reduce with SQL Server through simulations. In our studies, Hadoop completes the same query faster than SQL Server. We also test several other factors to see whether they affect the performance of Hadoop. We further find that including more machines for data processing lets Hadoop achieve better performance, especially for a large-scale data set.
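The two-function interface mentioned in the abstract can be illustrated with a minimal single-process sketch. This is not code from the paper and does not use Hadoop; the names `map_fn`, `reduce_fn`, and `run_map_reduce` are hypothetical, and word counting is chosen only as a familiar example task.

```python
from collections import defaultdict
from itertools import chain


def map_fn(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_fn(word, counts):
    """Reduce: combine all values emitted for a single key."""
    return word, sum(counts)


def run_map_reduce(lines):
    """Hypothetical in-memory driver: map, group by key, then reduce."""
    # Shuffle step: collect every mapped value under its key.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(line) for line in lines):
        groups[key].append(value)
    # Reduce step: one call per distinct key.
    return dict(reduce_fn(key, values) for key, values in groups.items())


result = run_map_reduce(["map reduce map", "sql reduce"])
print(result)
```

In SQL the same aggregation would typically be written as a `COUNT(*)` with `GROUP BY` over a table of words; a real framework would additionally distribute the map and reduce calls across machines.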
Keywords :
Internet; SQL; distributed processing; relational databases; Hadoop implementation; MapReduce programming model; Structured Query Language; cloud computing; large-scale data processing; Business; Cloud computing; Computational modeling; Data processing; Google; Programming; Servers; Cloud-Computing; Hadoop; Map-Reduce; SQL;
Conference_Title :
Parallel and Distributed Processing with Applications (ISPA), 2010 International Symposium on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-8095-1
Electronic_ISBN :
978-0-7695-4190-7
DOI :
10.1109/ISPA.2010.40