• DocumentCode
    700149
  • Title
    Minimum description length based protein secondary structure prediction
  • Author
    Hategan, Andrea ; Tabus, Ioan
  • Author_Institution
    Inst. of Signal Process., Tampere Univ. of Technol., Tampere, Finland
  • fYear
    2008
  • fDate
    25-29 Aug. 2008
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    This paper introduces a new algorithm for predicting the secondary structure of a protein based on the protein's primary structure, i.e., its amino acid sequence. The problem consists of finding the segmentation of the initial amino acid sequence, where each segment carries the label of a secondary structure, e.g., helix, strand, or coil. Our algorithm differs from existing probabilistic inference algorithms in that it uses probabilistic models suitable for directly encoding the joint information represented by the pair (amino acid sequence, secondary structure labels), and chooses as winner the secondary structure sequence providing the minimum representation, or description length, in line with the minimum description length principle. An additional benefit of our approach is that we provide not only a secondary structure prediction tool, but also a tool that can efficiently compress the joint sequences that define the primary and secondary structure information in proteins. The preliminary results obtained for prediction and compression show good performance, which is better in certain aspects than that of comparable algorithms.
  • Keywords
    image representation; image segmentation; image sequences; proteins; amino acid sequence; minimum description length; minimum representation; probabilistic inference algorithms; protein primary structure; protein secondary structure prediction; secondary structure labels; secondary structure prediction tool; secondary structure sequence; sequence segmentation; Amino acids; Context; Encoding; Prediction algorithms; Proteins; Signal processing algorithms; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference, 2008 16th European
  • Conference_Location
    Lausanne
  • ISSN
    2219-5491
  • Type
    conf
  • Filename
    7080681
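
The selection rule described in the abstract (pick the labeling whose joint encoding with the amino acid sequence is shortest) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' model: the three-letter amino acid alphabet, the first-order label model, the probability tables `emit`, `trans`, and `start`, and the function names are all hypothetical. The code length of a pair (sequence, labels) is taken as the sum of -log2 of the probabilities used to encode each label and each amino acid given its label, and the minimum is found by dynamic programming.

```python
import math

STATES = "HEC"  # helix, strand, coil

# Hypothetical toy parameters (not taken from the paper).
emit = {
    "H": {"A": 0.8, "V": 0.1, "G": 0.1},
    "E": {"A": 0.1, "V": 0.8, "G": 0.1},
    "C": {"A": 0.1, "V": 0.1, "G": 0.8},
}
trans = {s: {t: 0.8 if s == t else 0.1 for t in STATES} for s in STATES}
start = {s: 1.0 / 3.0 for s in STATES}


def description_length(sequence, labels):
    """Bits needed to encode the joint pair (sequence, labels) under the toy model."""
    bits = 0.0
    prev = None
    for aa, lab in zip(sequence, labels):
        p_label = start[lab] if prev is None else trans[prev][lab]
        bits += -math.log2(p_label) - math.log2(emit[lab][aa])
        prev = lab
    return bits


def min_dl_labels(sequence):
    """Return (labels, bits) for the labeling with the smallest total code length."""
    # cost[s] = fewest bits to encode the prefix so far, ending in label s
    cost = {s: -math.log2(start[s]) - math.log2(emit[s][sequence[0]]) for s in STATES}
    back = []
    for aa in sequence[1:]:
        new_cost, pointers = {}, {}
        for s in STATES:
            best_prev = min(STATES, key=lambda p: cost[p] - math.log2(trans[p][s]))
            new_cost[s] = (cost[best_prev] - math.log2(trans[best_prev][s])
                           - math.log2(emit[s][aa]))
            pointers[s] = best_prev
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheapest final label.
    last = min(STATES, key=cost.get)
    labels = [last]
    for pointers in reversed(back):
        labels.append(pointers[labels[-1]])
    return "".join(reversed(labels)), cost[last]


sequence = "AAAVVVGGG"
labels, bits = min_dl_labels(sequence)
print(labels, round(bits, 2))
```

Under this toy model the same code length also measures how compactly the joint pair could be stored, which mirrors the compression use mentioned in the abstract; a real system would estimate the tables from training data rather than fix them by hand.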