DocumentCode :
3767828
Title :
Big data emerging technologies: A case study with analyzing Twitter data using Apache Hive
Author :
Aditya Bhardwaj; Vanraj; Ankit Kumar; Yogendra Narayan; Pawan Kumar
Author_Institution :
Computer Science & Engineering Department, National Institute of Technical Teachers Training and Research, Chandigarh, India
fYear :
2015
Firstpage :
1
Lastpage :
6
Abstract :
These are days of growth and innovation for a better future. Companies increasingly recognize the need for Big Data to make decisions on complex problems. Big Data refers to collections of large datasets whose size is in the range of petabytes or zettabytes, or whose high rate of growth and complexity make them difficult to process and analyze using conventional database technologies. Big Data is generated from various sources, such as social networking sites like Facebook and Twitter, and the data can be structured, semi-structured, or unstructured. To extract valuable information from this huge amount of data, organizations need new tools and techniques to derive business benefits and gain a competitive advantage in the market. This paper presents a comprehensive study of major emerging Big Data technologies, highlighting their important features and how they work, along with a comparative study of them. It also presents a performance analysis of an Apache Hive query over Twitter tweets, measuring the MapReduce CPU time spent and the total time taken to finish the job.
Keywords :
"Big data","Twitter","File systems","Computer architecture","Google","Servers","Writing"
Publisher :
IEEE
Conference_Titel :
Recent Advances in Engineering & Computational Sciences (RAECS), 2015 2nd International Conference on
Type :
conf
DOI :
10.1109/RAECS.2015.7453400
Filename :
7453400