• DocumentCode
    2724433
  • Title

    Evaluating Protein Motif Significance Measures: A Case Study on Prosite Patterns

  • Author

    Ferreira, Pedro Gabriel ; Azevedo, Paulo J.

  • Author_Institution
    Dept. of Informatics, Minho Univ., Braga
  • fYear
    2007
  • fDate
    March 1 2007-April 5 2007
  • Firstpage
    171
  • Lastpage
    178
  • Abstract
    The existence of preserved subsequences in a set of related protein sequences suggests that they might play a structural and functional role in protein mechanisms. Because motif mining is exploratory, it tends to deliver a large number of motifs. It is therefore critical to have methods that identify the relevant, significant motifs. Many measures of interest and significance have been proposed. However, since motifs have a wide range of applications, the choice of appropriate significance measures is application dependent. Some measures give consistent, highly correlated results, while others disagree. In this paper we review existing measures and study their behavior in order to assist the selection of the most appropriate set of measures. An experimental evaluation of the measures for high quality patterns from the Prosite database is presented.
  • Keywords
    biology computing; data mining; proteins; sequences; Prosite database; Prosite patterns; protein motif significance measures; protein sequence mining; Computational intelligence; Data mining; Databases; Evolution (biology); Hidden Markov models; Informatics; Particle measurements; Pattern analysis; Protein sequence; Pulse width modulation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    1-4244-0705-2
  • Type

    conf

  • DOI
    10.1109/CIDM.2007.368869
  • Filename
    4221293