• Title of article

    Percolation of annotation errors through hierarchically structured protein sequence databases

  • Author/Authors

    Gilks, Walter R. and Audit, Benjamin and de Angelis, Daniela and Tsoka, Sophia and Ouzounis, Christos A.

  • Issue Information
    Journal with serial number, year 2005
  • Pages
    12
  • From page
    223
  • To page
    234
  • Abstract
    Databases of protein sequences have grown rapidly in recent years as a result of genome sequencing projects. Annotating protein sequences with descriptions of their biological function ideally requires careful experimentation, but this work lags far behind. Instead, biological function is often imputed by copying annotations from similar protein sequences. This gives rise to annotation errors and, more seriously, to chains of misannotation. An earlier study [Percolation of annotation errors in a database of protein sequences (2002)] developed a probabilistic framework for exploring the consequences of this percolation of errors through protein databases, and applied the theory to a simple database model. Here we apply the theory to hierarchically structured protein sequence databases, and draw conclusions about database quality at different levels of the hierarchy.
  • Keywords
    Annotation errors, Database quality, Hierarchical classification, Homology, Probability model, Protein database, Protein sequence, Biological function
  • Journal title
    Mathematical Biosciences
  • Serial Year
    2005
  • Record number

    1588839