Title : 
Grid Methodology for Identifying Co-Regulated Genes and Transcription Factor Binding Sites
         
        
            Author : 
Van Der Wath, Elizabeth ; Moutsianas, Loukas ; Van Der Wath, Richard ; Visagie, Alet ; Milanesi, Luciano ; Liò, Pietro
         
        
            Author_Institution : 
Comput. Lab., Cambridge Univ.
         
        
        
        
        
            fDate : 
6/1/2007
         
        
        
        
            Abstract : 
Identifying genes that are coordinately regulated is an important and challenging task in bioinformatics and a first step toward elucidating the topology of transcriptional networks. We first compare, in a grid setting, the performance of the Markov clustering algorithm with that of k-means on microarray test data sets. The gene expression information of the clustered genes can then be used to annotate transcription factor binding sites upstream of co-regulated genes. The methodology uses a regression model that relates gene expression levels to the matching scores of nucleotide patterns, allowing us to identify DNA-binding sites from a collection of noncoding DNA sequences from co-regulated genes. We also discuss extending the approach to multiple species by exploiting the grid framework.
         
        
            Keywords : 
DNA; Markov processes; biology computing; genetics; grid computing; molecular biophysics; regression analysis; DNA-binding sites; Grid methodology; Markov clustering algorithm; bioinformatics; coregulated genes; gene expression; k-means; microarray test data sets; noncoding DNA sequences; nucleotide pattern; regression model; transcription binding sites; transcription factor binding sites; transcriptional networks; Bioinformatics; Clustering algorithms; DNA; Diseases; Gene expression; Network topology; Pattern matching; Performance evaluation; Sequences; Testing; Gene clustering; gene expression; grid computing; microarray; protein binding sites; transcription factors; Algorithms; Binding Sites; Databases, Genetic; Gene Expression Profiling; Gene Expression Regulation; Information Storage and Retrieval; Internet; Multigene Family; Protein Binding; Transcription Factors;
         
        
        
            Journal_Title : 
NanoBioscience, IEEE Transactions on
         
        
        
        
        
            DOI : 
10.1109/TNB.2007.897470