DocumentCode
650636
Title
An Evaluation of Cassandra for Hadoop
Author
Dede, E. ; Sendir, B. ; Kuzlu, P. ; Hartog, J. ; Govindaraju, M.
Author_Institution
Grid & Cloud Comput. Res. Lab., SUNY Binghamton, Binghamton, NY, USA
fYear
2013
fDate
June 28 2013-July 3 2013
Firstpage
494
Lastpage
501
Abstract
In the last decade, the growth of social media, unconventional web technologies, and mobile applications has encouraged the development of a new breed of database models. NoSQL data stores target unstructured data, which is dynamic by nature and a key focus area for "Big Data" research. New-generation data can prove costly and impractical to administer with SQL databases because of its lack of structure and its high scalability and elasticity needs. NoSQL data stores such as MongoDB and Cassandra provide a desirable platform for fast and efficient data queries, which has increased their importance in areas such as cloud applications, e-commerce, social media, bioinformatics, and materials science. In an effort to combine the querying capabilities of conventional database systems and the processing power of the MapReduce model, this paper presents a thorough evaluation of the Cassandra NoSQL database when used in conjunction with the Hadoop MapReduce engine. We characterize the performance for a wide range of representative use cases, and then compare, contrast, and evaluate the results so that application developers can make informed decisions based on data size, cluster size, replication factor, and partitioning strategy to meet their performance needs.
Keywords
SQL; distributed databases; pattern clustering; public domain software; relational databases; Big Data research; Cassandra evaluation; Hadoop MapReduce engine; MapReduce model; MongoDB; NoSQL database; Web technologies; cluster size; data querying; data size; database models; mobile applications; partitioning strategy; performance needs; replication factor; representative use cases; social media; Benchmark testing; Data models; Distributed databases; Peer-to-peer computing; Servers; Writing; Cassandra; Distributed Computing; Hadoop; MapReduce; NoSQL;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cloud Computing (CLOUD), 2013 IEEE Sixth International Conference on
Conference_Location
Santa Clara, CA
Print_ISBN
978-0-7695-5028-2
Type
conf
DOI
10.1109/CLOUD.2013.31
Filename
6676732