Title :
Chaos Game Representation for Discriminating Thermophilic from Mesophilic Protein Sequences
Author :
Hu, Xue-Hai ; Xia, Jing-Bo ; Niu, Xiao-Hui ; Ma, Xuan ; Song, Chao-Hong ; Shi, Feng
Author_Institution :
Coll. of Sci., Huazhong Agric. Univ., Wuhan, China
Abstract :
Can sequence analysis tell us about the function of a protein? A basic question in protein science is which kinds of proteins exhibit thermostability. Chaos game representation (CGR) can reveal patterns hidden in a protein sequence, visually exposing previously unknown structure. In this paper, we convert every protein sequence into a 20-dimensional vector using the CGR algorithm and, based on these vectors, discriminate thermophiles from mesophiles using a support vector machine (SVM). The overall accuracy reaches 100% in the resubstitution test and 87.12% in the jackknife test. Moreover, the Matthews correlation coefficient (MCC) is 0.745.
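For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how overall accuracy and the Matthews correlation coefficient are conventionally computed from the counts of a binary confusion matrix. The function names and the example counts are illustrative assumptions, not values or code from the paper, and the abstract does not state which class was treated as positive.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Overall accuracy: fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here; the paper does not discuss this case).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the paper's data).
    tp, tn, fp, fn = 45, 40, 10, 5
    print("accuracy:", accuracy(tp, tn, fp, fn))
    print("MCC:", matthews_corrcoef(tp, tn, fp, fn))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which makes it a useful complement to accuracy when the two classes differ in size.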
Keywords :
biology computing; biothermics; correlation methods; game theory; proteins; sequences; statistical testing; support vector machines; thermal stability; vectors; 20-dimensional vector; Jackknife test; Matthews correlation coefficients; SVM; chaos game representation; hiding pattern; mesophilic protein sequences; resubstitution test; support vector machine; thermophilic discrimination; thermostability; Amino acids; Bonding; Chaos; DNA; Educational institutions; Microorganisms; Organisms; Protein sequence; Support vector machines; Testing;
Conference_Title :
Bioinformatics and Biomedical Engineering, 2009. ICBBE 2009. 3rd International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-2901-1
Electronic_ISBN :
978-1-4244-2902-8
DOI :
10.1109/ICBBE.2009.5162487