• Title of article

    Speeding-up codon analysis on the cloud with local MapReduce aggregation

  • Author/Authors

    Atanas Radenski, Louis Ehwerhemuepha

  • Issue Information
    Journal with serial number, year 2014
  • Pages
    11
  • From page
    175
  • To page
    185
  • Abstract
    A notable obstacle to higher performance of data-intensive Hadoop MapReduce (MR) bioinformatics algorithms is the large volume of intermediate data that need to be sorted, shuffled, and transmitted between mapper and reducer tasks. This difficulty manifests itself quite clearly in MR codon analysis, which is known to generate voluminous intermediate data that create a bottleneck in basic MR codon analysis algorithms. Our proposed approach to handling the intermediate data bottleneck is local in-mapper aggregation (or simply local aggregation), a technique that helps reduce the intermediate data volume between mapper and reducer tasks in MR. We evaluate the performance of local aggregation (i) by developing codon analysis MR algorithms with and without local aggregation and (ii) by experimentally measuring their performance on Amazon Web Services (AWS), the Amazon cloud platform. Codon analysis with local aggregation maintains consistently high performance as datasets grow, while basic codon analysis without local aggregation becomes impractically slow even for smaller datasets. Our results can be beneficial (i) to members of the bioinformatics community who need to perform fast and cost-effective nucleotide MR analysis on the cloud and (ii) to computer scientists who strive to increase the performance of MR algorithms.
  • Keywords
    Codon analysis, MapReduce, Local aggregation, Cloud computing, Hadoop
  • Journal title
    Information Sciences
  • Serial Year
    2014
  • Record number

    1216060
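
The local in-mapper aggregation named in the abstract can be illustrated with a small, self-contained simulation. This is only a sketch and not the authors' implementation: it runs in plain Python rather than on Hadoop, it assumes that codon analysis means counting non-overlapping three-letter codons in a single reading frame, and the function names (`codons`, `basic_mapper`, `aggregating_mapper`, `reducer`) are hypothetical. It shows that both mappers lead to the same totals while the aggregating mapper emits far fewer intermediate pairs.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def codons(sequence: str) -> Iterator[str]:
    """Yield non-overlapping 3-letter codons; a trailing partial codon is dropped.

    Assumption: a single reading frame starting at position 0.
    """
    for i in range(0, len(sequence) - 2, 3):
        yield sequence[i:i + 3]


def basic_mapper(split: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit one (codon, 1) pair per codon occurrence (no local aggregation)."""
    for sequence in split:
        for codon in codons(sequence):
            yield codon, 1


def aggregating_mapper(split: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Accumulate counts in memory for the whole input split, then emit
    one (codon, count) pair per distinct codon (local aggregation)."""
    local_counts: Counter = Counter()
    for sequence in split:
        local_counts.update(codons(sequence))
    yield from local_counts.items()


def reducer(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum the counts received for each codon."""
    totals: Counter = Counter()
    for codon, count in pairs:
        totals[codon] += count
    return dict(totals)


# Hypothetical input split: two short nucleotide sequences.
split = ["ATGATGATGCCC", "ATGCCCGG"]

basic_pairs = list(basic_mapper(split))
aggregated_pairs = list(aggregating_mapper(split))

# Both approaches produce identical totals after the reduce step.
basic_totals = reducer(basic_pairs)
aggregated_totals = reducer(aggregated_pairs)
```

With this input the basic mapper emits six intermediate pairs and the aggregating mapper emits two, one per distinct codon. The number of distinct codons is bounded by 64 for a four-letter alphabet, so the aggregated output of one mapper stays small however large its input split becomes, whereas the basic output grows linearly with the input.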