• DocumentCode
    3186186
  • Title
    Cloudflow - A framework for MapReduce pipeline development in Biomedical Research
  • Author
    Forer, Lukas ; Afgan, Enis ; Weissensteiner, Hansi ; Davidovic, Davor ; Specht, Gunther ; Kronenberg, Florian ; Schonherr, Sebastian
  • Author_Institution
    Div. of Genetic Epidemiology, Med. Univ. of Innsbruck, Innsbruck, Austria
  • fYear
    2015
  • fDate
    25-29 May 2015
  • Firstpage
    172
  • Lastpage
    177
  • Abstract
    The data-driven parallelization framework Hadoop MapReduce enables scalable analysis of large data sets. Because developing MapReduce programs can be time-intensive and challenging, the use of Hadoop in biomedical research is still limited. Here we present Cloudflow, a high-level framework that hides the implementation details of Hadoop and provides a set of building blocks for creating biomedical pipelines more intuitively. We demonstrate the benefit of Cloudflow on three different genetic use cases. We also show how the framework can be combined with the Hadoop workflow system Cloudgene and the cloud orchestration platform CloudMan to provide Hadoop pipelines as a service available to everyone.
  • Keywords
    cloud computing; data handling; medical computing; parallel processing; CloudMan; Cloudflow; Cloudgene; Hadoop MapReduce; biomedical research; cloud orchestration platform; data-driven parallelization framework; genetic use cases; high-level framework; pipeline development; workflow system; Bioinformatics; Genomics; Information filters; Pipeline processing; Pipelines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2015 38th International Convention on
  • Conference_Location
    Opatija
  • Type
    conf
  • DOI
    10.1109/MIPRO.2015.7160259
  • Filename
    7160259