Title :
Inference of binding sites with a Bayesian multiple-instance motif discovery method
Author :
Jajamovich, Guido H. ; Samoilov, Michael S. ; Wang, Xiaodong ; Arkin, Adam P.
Author_Institution :
Electr. Eng. Dept., Columbia Univ., New York, NY, USA
Date :
Sept. 29 2010-Oct. 1 2010
Abstract :
We present a Bayesian motif discovery (BMD) algorithm for detecting an unknown number of instances of a motif in a given set of sequences. The algorithm models a motif with a position weight matrix (PWM), which is estimated jointly with the motif discovery process. The technique is flexible enough to accept the results of other discovery algorithms as input. The method is based on a sequential Monte Carlo algorithm in which the state to be estimated consists of the number of motif instances in each sequence and their starting positions. The accuracy of the proposed method is compared with that of other profile-based discovery algorithms. BMD is shown to perform statistically better than MEME and BioProspector in applications ranging from synthetic data to genomic motif finding for Din serine recombinases. In the case of site-specific recombinase target discovery, the BMD-inferred motif is found to be the only functionally accurate one from the standpoint of the underlying biochemical mechanism.
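Illustration :
The abstract states that a motif is modelled with a position weight matrix (PWM). The sketch below is not the BMD algorithm and contains no sequential Monte Carlo step; it only illustrates, under stated assumptions, how a PWM can be estimated from aligned motif instances and used to score candidate start positions in a sequence. The function names (`build_pwm`, `log_odds`, `best_site`), the pseudocount of 1, and the uniform background of 0.25 are all hypothetical choices made for this example and do not come from the paper.

```python
import math

ALPHABET = "ACGT"


def build_pwm(instances, pseudocount=1.0):
    """Estimate per-position base probabilities from aligned, equal-length instances.

    The pseudocount is an assumption of this sketch; it keeps every
    probability strictly positive so that log-odds scores stay finite.
    """
    width = len(instances[0])
    pwm = []
    for j in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for instance in instances:
            counts[instance[j]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in ALPHABET})
    return pwm


def log_odds(window, pwm, background=0.25):
    """Log-odds score of one window against a uniform background (assumed)."""
    return sum(math.log(pwm[j][base] / background) for j, base in enumerate(window))


def best_site(sequence, pwm):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    starts = range(len(sequence) - width + 1)
    start = max(starts, key=lambda i: log_odds(sequence[i:i + width], pwm))
    return start, log_odds(sequence[start:start + width], pwm)
```

For example, calling `best_site("GGCGTATAATGCGC", build_pwm(["TATAAT", "TATAAT", "TATGAT", "TACAAT"]))` scans every window of width 6 and returns the start position of the best match together with its score. A method such as the one described in the abstract would additionally infer how many instances each sequence contains and update the PWM during discovery, which this fixed-PWM scan does not attempt.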
Keywords :
Bayes methods; Monte Carlo methods; biochemistry; biology computing; matrix algebra; Bayesian motif discovery algorithm; Bayesian multiple-instance motif discovery method; BioProspector; Din serine recombinase; MEME; binding sites; biochemical mechanism standpoint; genomic motif; motif discovery process; position weight matrix; profile based discovery; sequential Monte Carlo algorithm; site-specific recombinase target discovery; synthetic data; unknown number; Bayesian methods; Bioinformatics; Databases; Genomics; Hidden Markov models; Monte Carlo methods; Pulse width modulation; Motif Discovery; Sequential Monte Carlo Method;
Conference_Title :
Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on
Conference_Location :
Allerton, IL
Print_ISBN :
978-1-4244-8215-3
DOI :
10.1109/ALLERTON.2010.5706946