• DocumentCode
    2420773
  • Title
    Inference of binding sites with a Bayesian multiple-instance motif discovery method
  • Author
    Jajamovich, Guido H. ; Samoilov, Michael S. ; Wang, Xiaodong ; Arkin, Adam P.
  • Author_Institution
    Electr. Eng. Dept., Columbia Univ., New York, NY, USA
  • fYear
    2010
  • fDate
    Sept. 29 2010-Oct. 1 2010
  • Firstpage
    487
  • Lastpage
    494
  • Abstract
    We present a Bayesian motif discovery (BMD) algorithm for detecting an unknown number of instances of a motif in a given set of sequences. The algorithm models a motif with a position weight matrix (PWM), which is estimated jointly with the motif instances during discovery. The technique is flexible enough to accept the results of other discovery algorithms as input. The method is based on a sequential Monte Carlo algorithm, where the state to be estimated consists of the number of instances in each sequence and their starting positions. The accuracy of the proposed method is compared with that of other profile-based discovery algorithms. BMD is shown to perform statistically better than MEME and BioProspector in applications ranging from synthetic data to genomic motif finding for Din serine recombinases. In the case of site-specific recombinase target discovery, the BMD-inferred motif is found to be the only one that is functionally accurate from the standpoint of the underlying biochemical mechanism.
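    As an illustration of the PWM representation mentioned in the abstract, the following is a minimal sketch of estimating a PWM from aligned instances and scanning a sequence for every window that scores above a threshold. It is not the BMD sequential Monte Carlo procedure; the function names, the pseudocount, the uniform background, and the threshold are all assumptions made for this example.

```python
import math

ALPHABET = "ACGT"


def build_pwm(instances, pseudocount=1.0):
    """Estimate per-position base probabilities from aligned, equal-length instances.

    The pseudocount is an illustrative assumption, not a value from the paper.
    """
    width = len(instances[0])
    pwm = []
    for j in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for inst in instances:
            counts[inst[j]] += 1.0
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in ALPHABET})
    return pwm


def log_odds(window, pwm, background=0.25):
    """Log-odds score of one window against a uniform background (assumed)."""
    return sum(math.log(pwm[j][base] / background) for j, base in enumerate(window))


def scan(sequence, pwm, threshold=2.0):
    """Return (start, score) for every window scoring above the threshold.

    A sequence may yield zero, one, or several hits, which mirrors the
    'unknown number of instances' setting only in a very loose sense.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = log_odds(sequence[start:start + width], pwm)
        if score > threshold:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    pwm = build_pwm(["TATA", "TATA", "TATT"])
    for start, score in scan("TATACCTATT", pwm):
        print(start, round(score, 3))
```

    A fixed threshold is the simplest possible decision rule; the paper instead treats the number of instances and their positions as a state to be estimated.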
  • Keywords
    Bayes methods; Monte Carlo methods; biochemistry; biology computing; matrix algebra; Bayesian motif discovery algorithm; Bayesian multiple-instance motif discovery method; BioProspector; Din serine recombinase; MEME; binding sites; biochemical mechanism standpoint; genomic motif; motif discovery process; position weight matrix; profile based discovery; sequential Monte Carlo algorithm; site-specific recombinase target discovery; synthetic data; unknown number; Bayesian methods; Bioinformatics; Databases; Genomics; Hidden Markov models; Monte Carlo methods; Pulse width modulation; Motif Discovery; Sequential Monte Carlo Method;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on
  • Conference_Location
    Allerton, IL
  • Print_ISBN
    978-1-4244-8215-3
  • Type
    conf
  • DOI
    10.1109/ALLERTON.2010.5706946
  • Filename
    5706946