DocumentCode :
699014
Title :
Handling Big Data Efficiently by Using Map Reduce Technique
Author :
Maitrey, Seema ; Jha, C.K.
fYear :
2015
fDate :
13-14 Feb. 2015
Firstpage :
703
Lastpage :
708
Abstract :
Today's organizations capture extremely large amounts of data, and the volume continues to increase, making such data computationally inefficient to analyze. Researchers have addressed the problem of discovering knowledge from these continuously growing data sets. The quantity of available raw data has been increasing at a very high rate, and valuable information is concealed in large databases. Data mining has become an area of interest for extracting this embedded information, and over many years it has taken root in all kinds of application areas. This gave rise to many data mining methods that are applied in several real-life fields, but not all of these methods are capable of handling huge collections of data. In recent years, a number of computation- and data-intensive scientific data analyses have been established. To perform large-scale data mining analyses that meet the scalability and performance requirements of big data, several efficient parallel and concurrent algorithms have been applied. Many parallel algorithms are implemented using different parallelization techniques, such as threads, MPI, and MapReduce, which yield different performance and usability characteristics. The MPI model works efficiently for computation-intensive problems, but it is complicated to put into practical use. There is currently considerable enthusiasm around the MapReduce paradigm for large-scale data analysis. It is inspired by functional programming, which allows distributed computations on massive amounts of data to be expressed. It is designed for large-scale data processing because it can run on clusters of commodity hardware. MapReduce, a prominent parallel data processing tool, is gaining significant momentum in both industry and academia as the volume of data to analyze grows rapidly.
In this paper, we discuss MapReduce, its advantages and disadvantages, and how it can be used in integration with other technologies.
Keywords :
Big Data; concurrency control; data analysis; data mining; functional programming; parallel algorithms; MPI model; MapReduce paradigm; MapReduce technique; big data handling; concurrent algorithms; data intensive scientific data analysis; data mining methods; distributed computations; knowledge discovery; large scale data mining analyses; large-scale data analysis; large-scale data processing; parallel data processing tool; parallelization techniques; usability characteristics; Fault tolerance; Fault tolerant systems; Google; Radiation detectors; Clustering; DBMS; Hadoop; MapReduce; Parallel processing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence & Communication Technology (CICT), 2015 IEEE International Conference on
Conference_Location :
Ghaziabad
Print_ISBN :
978-1-4799-6022-4
Type :
conf
DOI :
10.1109/CICT.2015.140
Filename :
7078794
Link To Document :