Title :
Big data analysis using Apache Hadoop
Author :
Nandimath, Jyoti ; Banerjee, Ekata ; Patil, Abhijit ; Kakade, Pratima ; Vaidya, Salil ; Chaturvedi, Divyansh
Author_Institution :
Dept. of Comput. Eng., SKNCOE, Pune, India
Abstract :
The paradigm for processing huge datasets has shifted from centralized to distributed architectures. As enterprises began gathering large volumes of data, they found that it could not be processed by any existing centralized solution. Beyond time constraints, enterprises faced problems of efficiency, performance, and elevated infrastructure cost when processing data in a centralized environment. Distributed architectures enabled these large organizations to extract relevant information from huge data dumps. Apache Hadoop is one of the best open-source tools on the market for harnessing a distributed architecture to solve such data-processing problems. Using Apache Hadoop's components, such as data clusters, the MapReduce algorithm, and distributed processing, we resolve complex location-based data problems and return the relevant information to the system, thereby improving the user experience.
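The map-reduce pattern the abstract refers to can be sketched in plain Python. This is an illustrative example of the general map/shuffle/reduce flow, not the authors' implementation and not Hadoop's Java API; the location records and the counting task are hypothetical stand-ins for the paper's location-based data.

```python
from collections import defaultdict

def map_phase(records):
    # Emit (location, 1) pairs, as a Hadoop mapper would per input split.
    # The "location" field is a hypothetical example key.
    for record in records:
        yield record["location"], 1

def shuffle(pairs):
    # Group intermediate values by key; Hadoop performs this
    # shuffle-and-sort step between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Aggregate each key's values, as a Hadoop reducer would.
    return {key: sum(values) for key, values in groups.items()}

records = [
    {"location": "Pune"},
    {"location": "Pune"},
    {"location": "San Francisco"},
]
counts = reduce_phase(shuffle(map_phase(records)))
# counts == {"Pune": 2, "San Francisco": 1}
```

In an actual Hadoop cluster, the map and reduce functions run in parallel across data nodes, with HDFS supplying the input splits and the framework handling the shuffle; the single-process sketch above only shows the data flow.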
Keywords :
data analysis; distributed processing; public domain software; software architecture; Apache Hadoop; big data analysis; centralized architecture; complex data problems; data clusters; data processing; data processing problems; distributed architecture; open source software; open source tools; relevant information; time constraints; Computers; Data handling; Data processing; Data storage systems; Distributed databases; Information management; Big data; Data processing; Hadoop; Map Reduce;
Conference_Titel :
Information Reuse and Integration (IRI), 2013 IEEE 14th International Conference on
Conference_Location :
San Francisco, CA
DOI :
10.1109/IRI.2013.6642536