  • DocumentCode
    1885119
  • Title
    A Maintainable Software Architecture for Fast and Modular Bioinformatics Sequence Search
  • Author
    Archuleta, Jeremy ; Tilevich, Eli ; Feng, Wu-chun
  • Author_Institution
    Virginia Tech, Blacksburg
  • fYear
    2007
  • fDate
    2-5 Oct. 2007
  • Firstpage
    144
  • Lastpage
    153
  • Abstract
    Bioinformaticists use the Basic Local Alignment Search Tool (BLAST) to characterize an unknown sequence by comparing it against a database of known sequences, thus detecting evolutionary relationships and biological properties. mpiBLAST is a widely used, high-performance, open-source parallelization of BLAST that runs on computer clusters and delivers super-linear speedups. However, the Achilles heel of mpiBLAST is its lack of modularity, which adversely affects maintainability and extensibility. Alleviating this shortcoming requires an architectural refactoring that improves maintainability and extensibility while preserving high performance. Toward that end, this paper evaluates five different software architectures and details how each satisfies our design objectives. In addition, we introduce a novel approach to using mixin layers that enables mixing and matching of modules when constructing sequence-search applications for a variety of high-performance computing systems. Our design, which we call "mixin layers with refined roles", uses mixin layers to separate functionality into complementary modules, and the refined roles in each layer improve the inherently modular design by enabling flexible and structured parallel development, a necessity for an open-source application. We believe that this new software architecture for mpiBLAST-2.0 will benefit both the users and developers of the package and that our evaluation of different software architectures will be of value to other software engineers faced with the challenges of creating maintainable and extensible high-performance bioinformatics software.
  • Keywords
    biology computing; information retrieval; parallel processing; public domain software; sequences; software architecture; software maintenance; basic local alignment search tool; bioinformatics sequence search; computer cluster; maintainable software architecture; modular design; open-source parallelization; software refactoring; Application software; Bioinformatics; Biology computing; Concurrent computing; Databases; Open source software; Packaging; Software architecture; Software maintenance; Software packages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Maintenance, 2007. ICSM 2007. IEEE International Conference on
  • Conference_Location
    Paris
  • ISSN
    1063-6773
  • Print_ISBN
    978-1-4244-1256-3
  • Electronic_ISBN
    1063-6773
  • Type
    conf
  • DOI
    10.1109/ICSM.2007.4362627
  • Filename
    4362627