• DocumentCode
    1756855
  • Title

    On Representing Protein Folding Patterns Using Non-Linear Parametric Curves

  • Author

    Kasarapu, Parthan ; de la Banda, Maria Garcia ; Konagurthu, Arun S.

  • Author_Institution
    Clayton Sch. of Inf. Technol., Monash Univ., Clayton, VIC, Australia
  • Volume
    11
  • Issue
    6
  • fYear
    2014
  • fDate
    Nov.-Dec. 1 2014
  • Firstpage
    1218
  • Lastpage
    1228
  • Abstract
    Proteins fold into complex three-dimensional shapes. Simplified representations of their shapes are central to rationalising, comparing, classifying, and interpreting protein structures. Traditional methods to abstract protein folding patterns rely on representing their standard secondary structural elements (helices and strands of sheet) using line segments. This ignores a significant proportion of structural information. The motivation of this research is to derive mathematically rigorous and biologically meaningful abstractions of protein folding patterns that maximize the economy of structural description and minimize the loss of structural information. We report on a novel method to describe a protein as a non-overlapping set of parametric three-dimensional curves of varying length and complexity. Our approach to this problem is supported by information theory and uses the statistical framework of minimum message length (MML) inference. We demonstrate the effectiveness of our non-linear abstraction to support efficient and effective comparison of protein folding patterns on a large scale.
  • Keywords
    bioinformatics; molecular biophysics; molecular configurations; pattern classification; proteins; statistical analysis; biologically meaningful abstractions; complex three-dimensional shapes; information theory; line segments; mathematically rigorous abstractions; minimum message length inference; nonlinear parametric curves; nonoverlapping set; parametric three-dimensional curves; protein classification; protein folding pattern representation; protein structures; sheet helices; sheet strands; standard secondary structural elements; statistical framework; structural description; structural information loss; Bioinformatics; Computational biology; Encoding; Pattern recognition; Proteins; Receivers; Transmitters; Protein folding patterns; minimum message length; protein abstractions
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2014.2338319
  • Filename
    6853347