Title :
Hadoop-HBase for large-scale data
Author :
Vora, Mehul Nalin
Author_Institution :
Innovation Labs., Tata Consultancy Services (TCS) Ltd., Mumbai, India
Abstract :
Today we are inundated with digital data, yet we remain poor at managing and processing it. It is becoming increasingly difficult to store and analyze data efficiently and economically with conventional database management tools. Moreover, the types of data appearing in databases are also changing: nowadays, binary large objects are a standard, integral part of any database. Researchers all over the globe are grappling with the analysis of these ultra-large databases. Apache HBase is one attempt to address this problem. HBase is a NoSQL distributed database built on top of the Hadoop Distributed File System (HDFS). In this paper, we present an evaluation of a hybrid architecture in which HDFS holds the non-textual data, such as images, and the location of that data is stored in HBase. This hybrid architecture enables faster search and retrieval of data, a growing need in any organization that is flooded with data. The paper evaluates the performance of random reads and random writes of data storage location information in HBase, together with the corresponding retrieval and storage of the data in HDFS. We also present a comparative study of the HBase-HDFS architecture against a MySQL-HDFS architecture.
Keywords :
SQL; distributed databases; information retrieval; Apache HBase; HBase-HDFS architecture; Hadoop distributed file system; Hadoop-HBase; MySQL-HDFS architecture; conventional database management tools; data retrieval; data storage location information; hybrid architecture; large-scale data; noSQL distributed database; nontextual data; organization; random reads; random writes; Computer architecture; Distributed databases; Fault tolerance; File systems; Hardware; Time factors; HBase; HDFS; Hadoop; Map Reduce; distributed storage; large-scale data; noSQL database;
Conference_Title :
Computer Science and Network Technology (ICCSNT), 2011 International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-1-4577-1586-0
DOI :
10.1109/ICCSNT.2011.6182030