DocumentCode
2420773
Title
Inference of binding sites with a Bayesian multiple-instance motif discovery method
Author
Jajamovich, Guido H. ; Samoilov, Michael S. ; Wang, Xiaodong ; Arkin, Adam P.
Author_Institution
Electr. Eng. Dept., Columbia Univ., New York, NY, USA
fYear
2010
fDate
Sept. 29 2010-Oct. 1 2010
Firstpage
487
Lastpage
494
Abstract
We present a Bayesian motif discovery (BMD) algorithm for detecting an unknown number of instances of a motif in a given set of sequences. The algorithm models a motif with a position weight matrix (PWM), which is estimated as part of the motif discovery process. The technique is flexible enough to accept the results of other discovery algorithms as input. The method is based on a sequential Monte Carlo algorithm, where the state to be estimated consists of the number of motif instances in each sequence and their starting positions. The accuracy of the proposed method is compared with that of other profile-based discovery algorithms. BMD is shown to perform statistically better than MEME and BioProspector in applications ranging from synthetic data to genomic motif finding for Din serine recombinases. In the case of site-specific recombinase target discovery, the BMD-inferred motif is found to be the only one that is functionally accurate from the standpoint of the underlying biochemical mechanism.
Keywords
Bayes methods; Monte Carlo methods; biochemistry; biology computing; matrix algebra; Bayesian motif discovery algorithm; Bayesian multiple-instance motif discovery method; BioProspector; Din serine recombinase; MEME; binding sites; biochemical mechanism standpoint; genomic motif; motif discovery process; position weight matrix; profile based discovery; sequential Monte Carlo algorithm; site-specific recombinase target discovery; synthetic data; unknown number; Bayesian methods; Bioinformatics; Databases; Genomics; Hidden Markov models; Monte Carlo methods; Pulse width modulation; Motif Discovery; Sequential Monte Carlo Method;
fLanguage
English
Publisher
IEEE
Conference_Titel
Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on
Conference_Location
Allerton, IL
Print_ISBN
978-1-4244-8215-3
Type
conf
DOI
10.1109/ALLERTON.2010.5706946
Filename
5706946