• DocumentCode
    260804
  • Title

    A micropartitioning technique for massive data analysis using MapReduce

  • Author

    Mohanapriya, S. ; Natesan, P.

  • Author_Institution
    Dept. of CSE, Kongu Eng. Coll., Erode, India
  • fYear
    2014
  • fDate
    27-28 Feb. 2014
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    In recent years, large amounts of structured and unstructured data have been collected from various sources. Such volumes are difficult for a single machine to handle, so the work must be distributed across a large number of computers. Hadoop is one such distributed framework; it processes data in a distributed manner using the MapReduce programming model. For MapReduce to work, it has to divide the workload across the machines in the cluster. The performance of MapReduce depends on how evenly it distributes the workload across the machines without skew, and on avoiding running jobs on a poorly performing node, called a straggler. The workload distribution depends on the algorithm that partitions the data. To overcome the problem of skew, an efficient partitioning technique is proposed. The proposed algorithm improves load balancing and reduces memory requirements. Slow-running nodes degrade the performance of a MapReduce job. To overcome this problem, a technique called micropartitioning is used, which divides the work into small tasks that outnumber the reducers and assigns them to the reducers. Running many small tasks lessens the impact of stragglers, since the work that would have been scheduled on a slow node is small and can be performed by other idle workers.
  • Keywords
    data analysis; resource allocation; Hadoop; MapReduce programming model; load balancing; massive data analysis; micropartitioning technique; reducers; straggler; workload distribution; Computational modeling; Computers; Data models; File systems; Google; Load management; Programming; Hadoop; MapReduce; Partitioning; Skew; Straggler; TeraSort;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Communication and Embedded Systems (ICICES), 2014 International Conference on
  • Conference_Location
    Chennai
  • Print_ISBN
    978-1-4799-3835-3
  • Type

    conf

  • DOI
    10.1109/ICICES.2014.7033824
  • Filename
    7033824
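
The straggler argument in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' algorithm: the function names (`makespan`, `split_evenly`), the worker speeds, and the task counts are all hypothetical, and the scheduler is a simple "idle worker pulls the next task" model rather than Hadoop's actual scheduler.

```python
import heapq


def makespan(task_sizes, worker_speeds):
    """Time until all tasks finish when each idle worker pulls the next task.

    Hypothetical model: a task of size s on a worker with speed v takes s / v.
    """
    # Min-heap of (time the worker becomes free, worker index).
    free_at = [(0.0, w) for w in range(len(worker_speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for size in task_sizes:
        t, w = heapq.heappop(free_at)       # earliest idle worker
        done = t + size / worker_speeds[w]  # when this task completes
        finish = max(finish, done)
        heapq.heappush(free_at, (done, w))
    return finish


def split_evenly(total_work, num_tasks):
    """Split the total work into equally sized tasks (no data skew assumed)."""
    return [total_work / num_tasks] * num_tasks


# Assumed cluster: three normal workers and one straggler at quarter speed.
speeds = [1.0, 1.0, 1.0, 0.25]
total_work = 120.0

# One partition per reducer: the straggler holds a full quarter of the work.
coarse = makespan(split_evenly(total_work, 4), speeds)

# Micropartitioning: ten times more partitions than reducers.
micro = makespan(split_evenly(total_work, 40), speeds)

print(f"one partition per reducer: {coarse:.1f}")
print(f"micropartitions:           {micro:.1f}")
```

With one partition per reducer, the job cannot finish before the straggler completes its full share. With many small partitions, the straggler only ever holds one small task at a time, and the faster workers pick up the remainder.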