Title :
A Maintainable Software Architecture for Fast and Modular Bioinformatics Sequence Search
Author :
Archuleta, Jeremy ; Tilevich, Eli ; Feng, Wu-chun
Author_Institution :
Virginia Tech, Blacksburg
Abstract :
Bioinformaticists use the Basic Local Alignment Search Tool (BLAST) to characterize an unknown sequence by comparing it against a database of known sequences, thus detecting evolutionary relationships and biological properties. mpiBLAST is a widely used, high-performance, open-source parallelization of BLAST that runs on a computer cluster and delivers super-linear speedups. However, the Achilles' heel of mpiBLAST is its lack of modularity, which adversely affects maintainability and extensibility. Alleviating this shortcoming requires an architectural refactoring that improves maintainability and extensibility while preserving high performance. Toward that end, this paper evaluates five different software architectures and details how each satisfies our design objectives. In addition, we introduce a novel approach to using mixin layers to enable mixing-and-matching of modules in constructing sequence-search applications for a variety of high-performance computing systems. Our design, which we call "mixin layers with refined roles", uses mixin layers to separate functionality into complementary modules, while the refined roles in each layer improve the inherently modular design by facilitating flexible and structured parallel development, a necessity for an open-source application. We believe that this new software architecture for mpiBLAST-2.0 will benefit both the users and developers of the package, and that our evaluation of different software architectures will be of value to other software engineers faced with the challenge of creating maintainable, extensible, high-performance bioinformatics software.
Keywords :
biology computing; information retrieval; parallel processing; public domain software; sequences; software architecture; software maintenance; basic local alignment search tool; bioinformatics sequence search; computer cluster; maintainable software architecture; modular design; open-source parallelization; software refactoring; Application software; Bioinformatics; Biology computing; Concurrent computing; Databases; Open source software; Packaging; Software architecture; Software maintenance; Software packages
Conference_Title :
Software Maintenance, 2007. ICSM 2007. IEEE International Conference on
Conference_Location :
Paris
Print_ISBN :
978-1-4244-1256-3
ISSN :
1063-6773
DOI :
10.1109/ICSM.2007.4362627