• DocumentCode
    3706762
  • Title

    Alignment-Free Sequence Comparison over Hadoop for Computational Biology

  • Author

    Giuseppe Cattaneo;Umberto Ferraro Petrillo;Raffaele Giancarlo;Gianluca Roscigno

  • Author_Institution
    Dipt. di Inf., Univ. degli Studi di Salerno, Fisciano, Italy
  • fYear
    2015
  • Firstpage
    184
  • Lastpage
    192
  • Abstract
    Sequence comparison, i.e., the assessment of how similar two biological sequences are to each other, is a fundamental and routine task in Computational Biology and Bioinformatics. Classically, alignment methods are the de facto standard for such an assessment. In fact, considerable research effort has gone into the development of efficient algorithms, on both classic and parallel architectures, over the past 50 years. Due to the growing amount of sequence data being produced, a new class of methods has emerged: alignment-free methods. Research in this area has become very intense in the past few years, stimulated by the advent of Next Generation Sequencing technologies, since those new methods are very appealing in terms of computational resources needed and biological relevance. Despite such an effort, and in contrast with sequence alignment methods, no systematic investigation of how to take advantage of distributed architectures to speed up alignment-free methods has taken place. We provide a contribution of that kind by evaluating the possibility of using the Hadoop distributed framework to speed up the running times of these methods, compared to their original sequential formulation.
  • Keywords
    "Bioinformatics","Computer architecture","Biology","Context","Electronic mail","Sequences","Pattern matching"
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing Workshops (ICPPW), 2015 44th International Conference on
  • ISSN
    1530-2016
  • Type

    conf

  • DOI
    10.1109/ICPPW.2015.28
  • Filename
    7349910