DocumentCode :
3199841
Title :
Regrouping of pattern clusters to reveal characteristics of distinct classes and related classes
Author :
Zhou, Pei-Yuan ; Lee, En-Shiun Annie ; Wong, Andrew K. C.
Author_Institution :
Dept. of Comput., Hong Kong Polytech. Univ., Kowloon, China
fYear :
2013
fDate :
18-21 Dec. 2013
Firstpage :
55
Lastpage :
61
Abstract :
Discovering protein patterns in amino acids and their biochemical properties is important for revealing the underlying biophysical models. Pattern clustering was previously introduced to relate the discovered protein patterns to taxonomic classes in a localized region of a protein. Building on that previous work, this paper proposes an algorithm that synthesizes and regroups pattern clusters to maximize their separability, thereby revealing the class characteristics of the localized region of the protein. To evaluate the results of pattern clustering and of regrouping pattern clusters, we introduce three evaluation measures: F-measure, class entropy measure, and attribute entropy measure. To validate the proposed algorithm, experiments are run on synthetic data and on protein family data with amino acid attributes and with chemical property attributes. The experimental results show that: a) regrouping pattern clusters separates classes more accurately than pattern clustering alone; b) the clusters after regrouping are more distinctly separable from each other than those obtained by pattern clustering alone; c) two types of pattern clusters are found, one pertaining to distinct classes and the other associated with two or more related classes; and d) class characteristics are clearly revealed in the data subspace containing the patterns in the pattern clusters. The datasets with chemical properties show that unsupervised techniques can reveal common chemical attributes in the inherent classes as more of the common properties shared by different amino acids are taken into account.
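The abstract names F-measure and class entropy as evaluation measures but this record does not give their definitions. As a rough illustration only, the sketch below uses the standard clustering forms of both measures, which may differ from the paper's own definitions. Each cluster is assumed to be a list of ground-truth class labels, and the function names `class_entropy` and `clustering_f_measure` are hypothetical. The attribute entropy measure is not sketched because its form cannot be inferred from the abstract.

```python
from collections import Counter
from math import log2


def class_entropy(clusters):
    """Size-weighted average Shannon entropy (bits) of class labels per cluster.

    Assumption: standard definition, not necessarily the paper's.
    `clusters` is a list of lists of class labels, with at least one item overall.
    """
    total = sum(len(labels) for labels in clusters)
    weighted = 0.0
    for labels in clusters:
        n = len(labels)
        counts = Counter(labels)
        # Entropy of the class distribution inside this cluster.
        h = -sum((k / n) * log2(k / n) for k in counts.values())
        weighted += (n / total) * h
    return weighted


def clustering_f_measure(clusters):
    """Class-size-weighted best F1 over clusters (standard clustering F-measure).

    Assumption: standard definition, not necessarily the paper's.
    """
    total = sum(len(labels) for labels in clusters)
    class_sizes = Counter(label for labels in clusters for label in labels)
    score = 0.0
    for cls, size in class_sizes.items():
        best = 0.0
        for labels in clusters:
            hits = labels.count(cls)
            if hits == 0:
                continue
            precision = hits / len(labels)
            recall = hits / size
            best = max(best, 2 * precision * recall / (precision + recall))
        score += (size / total) * best
    return score


if __name__ == "__main__":
    pure = [["A", "A"], ["B", "B"]]    # each cluster holds one class
    mixed = [["A", "B"], ["A", "B"]]   # each cluster mixes both classes
    print(class_entropy(pure), clustering_f_measure(pure))
    print(class_entropy(mixed), clustering_f_measure(mixed))
```

Under these assumed definitions, lower class entropy and higher F-measure both indicate clusters that align more closely with single classes.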
Keywords :
molecular biophysics; pattern clustering; proteins; F-measure; amino acids; attribute entropy measure; biochemical property; class entropy measure; distinct classes; pattern clusters regrouping; protein patterns; related classes; separability; taxonomic classes; Amino acids; Clustering algorithms; Complexity theory; Entropy; Fungi; Pattern clustering; Proteins; local optimal; pattern cluster; protein functionality; regrouping; taxonomy;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2013 IEEE International Conference on
Conference_Location :
Shanghai
Type :
conf
DOI :
10.1109/BIBM.2013.6732718
Filename :
6732718