Title :
Intelligent and Efficient Web-Based Middleware for Data Management in Motif Finding in Gene Regulation
Author :
Hussain, Shiraz ; Adeogun, Samuel ; Dotu, Bright ; Yang, L.T.
Author_Institution :
Comput. Sci., Fisk Univ., Nashville, TN, USA
Abstract :
Due to the latest developments in next-generation sequencing and bioinformatics tools, efficient data management is a major challenge in bioinformatics-related research. While the availability of biological data facilitates research in genetics and genomics, and there are databases where researchers can access these data, downloading and managing the data has been a daunting task for individuals with little or no programming expertise. The discovery of the right motif in a DNA sequence remains a challenging and important problem in bioinformatics and regulatory genomics. While a number of motif finding algorithms and applications have been developed over the years, the Gibbs sampling algorithm has shown great promise for discovering motifs in the promoter regions of genes. However, the Gibbs sampling algorithm does not extend to the processing of biological data obtained directly from laboratory experiments and stored in spreadsheet applications; instead, it focuses exclusively on processing data already expressed as DNA sequences. In this work, we develop a middleware that helps researchers download and manage data from public databases such as the National Center for Biotechnology Information (NCBI) databases. This is made possible using Biopython, an open-source module for the Python programming language. A customized web-based interface is developed to help non-programming scientists access these biological databases more easily. In the proposed middleware, HTML, CSS and PHP are used for the front-end web-based interface and Python is used for the backend data processing. The proposed middleware was used to investigate motifs for gene regulation, where a large number of DNA sequences were downloaded from NCBI and analyzed using customized and open-source motif finding tools such as Gibbs Sampler, Meme Chip, Align Ace and an in-house modified Gibbs Sampler implementation.
Our web-based middleware was very effective in automating data management: it reduced the overall data download overhead and shortened management time from several days to only a few hours.
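To make the Gibbs sampling idea mentioned in the abstract concrete, the following is a minimal sketch of a generic textbook-style Gibbs sampler for a single motif of fixed width. It is not the in-house modified Gibbs Sampler described above, whose details the abstract does not give. The function names (consensus_score, gibbs_motif_search), the pseudocount of 1, the iteration count and the toy sequences are all assumptions made for illustration, and the sketch assumes uppercase sequences containing only A, C, G and T, each at least as long as the motif width.

```python
import random

ALPHABET = "ACGT"  # assumption: sequences contain only these uppercase bases


def consensus_score(seqs, pos, k):
    """Sum, over the k motif columns, of the count of the most common base."""
    total = 0
    for c in range(k):
        column = [s[p + c] for s, p in zip(seqs, pos)]
        total += max(column.count(b) for b in ALPHABET)
    return total


def gibbs_motif_search(seqs, k, iters=1000, seed=0):
    """Illustrative Gibbs sampler: one motif site of width k per sequence.

    Returns the start positions with the best consensus score seen.
    """
    rng = random.Random(seed)
    # Start from a random motif position in every sequence.
    pos = [rng.randrange(len(s) - k + 1) for s in seqs]
    best_pos, best_score = list(pos), consensus_score(seqs, pos, k)

    for _ in range(iters):
        i = rng.randrange(len(seqs))  # sequence held out in this step

        # Build per-column base counts from the other sequences,
        # starting every count at 1 (pseudocount) to avoid zeros.
        counts = [{b: 1.0 for b in ALPHABET} for _ in range(k)]
        for j, s in enumerate(seqs):
            if j != i:
                for c in range(k):
                    counts[c][s[pos[j] + c]] += 1.0
        denom = (len(seqs) - 1) + len(ALPHABET)

        # Weight every candidate start in the held-out sequence by the
        # probability of its k-mer under the current profile.
        weights = []
        for start in range(len(seqs[i]) - k + 1):
            w = 1.0
            for c in range(k):
                w *= counts[c][seqs[i][start + c]] / denom
            weights.append(w)

        # Sample the new position in proportion to those weights.
        pos[i] = rng.choices(range(len(weights)), weights=weights)[0]

        score = consensus_score(seqs, pos, k)
        if score > best_score:
            best_pos, best_score = list(pos), score

    return best_pos


if __name__ == "__main__":
    # Toy input, invented for illustration only.
    toy = ["TTACGTAA", "GGACGTCC", "CACGTGTT"]
    starts = gibbs_motif_search(toy, k=4, iters=200, seed=1)
    for s, p in zip(toy, starts):
        print(p, s[p:p + 4])
```

Because the sampler is stochastic, different seeds can return different positions; in practice such samplers are usually run several times and the best-scoring result kept.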
Keywords :
DNA; Internet; bioinformatics; hypermedia markup languages; middleware; public domain software; user interfaces; Align Ace; Biopython; CSS; DNA sequences; HTML; Meme Chip; NCBI databases; National Center for Biotechnology Information; PHP; backend data processing; bioinformatics; biological databases; customized Web-based interface; data download overhead reduction; front-end Web-based interface; gene regulation; in-house modified Gibbs Sampler; intelligent Web-based middleware; motif finding algorithms; open-source Python programming language module; open-source motif finding tools; public databases; Bioinformatics; DNA; Databases; Genomics; Mathematical model; Middleware; bioinformatics; gene regulation; middleware; motif;
Conference_Title :
Computational Science and Engineering (CSE), 2013 IEEE 16th International Conference on
Conference_Location :
Sydney, NSW
DOI :
10.1109/CSE.2013.39