Title :
Big data analysis using Hadoop cluster
Author :
Saldhi, Ankita ; Yadav, Dipesh ; Saksena, Dhruv ; Goel, Abhinav ; Saldhi, Ankur ; Indu, S.
Author_Institution :
Centre for Development of Telematics, Mehrauli, Mandi road, Delhi-110030, India
Abstract :
Industries track statistics about their business and process this data with various data mining techniques to measure profit trends, revenue, growing markets, and promising investment opportunities. These statistical records grow rapidly. As the data grows, processing such a large data set and extracting meaningful information becomes a tedious task. If the data is also generated in various formats, its processing poses new challenges. Owing to its size, big data is stored in the Hadoop Distributed File System (HDFS). In this standard architecture, all the DataNodes operate in parallel, but each individual DataNode still executes its work sequentially. This paper proposes executing the tasks assigned to a single DataNode in parallel instead of sequentially. We propose using a set of streaming multiprocessors (SMs) for each DataNode. An SM can have multiple processors and memory, and all SMs run in parallel and independently. We process big data, which may come from different sources in different formats, in parallel on a Hadoop cluster using the proposed technique to obtain the desired results efficiently. We have applied the proposed methodology to the raw data of an industrial firm, to support intelligent business decisions, with the final objective of finding the profit generated for the firm and its trends throughout a year. We analyzed a full year of data because trends generally repeat annually.
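The idea of running the tasks of a single node concurrently, then reducing them to monthly profit, can be illustrated with a minimal sketch. This is not the authors' implementation: a Python thread pool stands in for the streaming multiprocessors, and the record layout (month, revenue, cost) and the names map_split, reduce_partials, and process_node are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Assumed record layout (hypothetical): (month "YYYY-MM", revenue, cost)


def map_split(records):
    """Mapper: compute per-month partial profit for one input split."""
    partial = defaultdict(float)
    for month, revenue, cost in records:
        partial[month] += revenue - cost
    return dict(partial)


def reduce_partials(partials):
    """Reducer: merge per-split partial sums into monthly totals."""
    totals = defaultdict(float)
    for partial in partials:
        for month, profit in partial.items():
            totals[month] += profit
    return dict(sorted(totals.items()))


def process_node(splits, workers=4):
    """Run the splits assigned to one node concurrently, not one after another.

    The thread pool is only a stand-in for the SMs described in the abstract.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_split, splits))
    return reduce_partials(partials)


# Two small made-up splits, used only to exercise the sketch.
splits = [
    [("2014-01", 100.0, 60.0), ("2014-02", 80.0, 50.0)],
    [("2014-01", 40.0, 10.0), ("2014-03", 90.0, 95.0)],
]
monthly_profit = process_node(splits)
```

Here monthly_profit maps each month to its total profit, so a negative value marks a loss-making month. Sorting the keys keeps the months in calendar order, which makes a year-long trend easy to read.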
Keywords :
Big data; Data mining; File systems; Market research; Servers; Hadoop; Mappers; Reducers; distributed data processing
Conference_Title :
Computational Intelligence and Computing Research (ICCIC), 2014 IEEE International Conference on
Print_ISBN :
978-1-4799-3974-9
DOI :
10.1109/ICCIC.2014.7238418