DocumentCode
2724433
Title
Evaluating Protein Motif Significance Measures: A Case Study on Prosite Patterns
Author
Ferreira, Pedro Gabriel ; Azevedo, Paulo J.
Author_Institution
Dept. of Informatics, Minho Univ., Braga
fYear
2007
fDate
March 1 2007-April 5 2007
Firstpage
171
Lastpage
178
Abstract
The existence of preserved subsequences in a set of related protein sequences suggests that they might play a structural and functional role in the protein's mechanisms. Due to its exploratory nature, the mining process tends to deliver a large number of motifs. It is therefore critical to develop methods that identify the relevant, significant motifs. Many measures of interest and significance have been proposed. However, since motifs have a wide range of applications, the choice of appropriate significance measures is application dependent. Some measures give consistent, highly correlated results, while others disagree. In this paper we review existing measures and study their behavior in order to assist in selecting the most appropriate set of measures. An experimental evaluation of the measures on high-quality patterns from the Prosite database is presented.
Keywords
biology computing; data mining; proteins; sequences; Prosite database; Prosite patterns; protein motif significance measures; protein sequence mining; Computational intelligence; Data mining; Databases; Evolution (biology); Hidden Markov models; Informatics; Particle measurements; Pattern analysis; Protein sequence; Pulse width modulation
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on
Conference_Location
Honolulu, HI
Print_ISBN
1-4244-0705-2
Type
conf
DOI
10.1109/CIDM.2007.368869
Filename
4221293