Title :
Principal Component Analysis of O-linked Glycosylation Sites in Protein Sequence
Author :
Yang, Xue-Mei ; Chen, Yen-wei ; Ito, Masahiro ; Nishikawa, Ikuko
Author_Institution :
Xianyang Normal Univ., Xianyang
Abstract :
In this paper, the structure of O-glycosylated proteins is analyzed in detail by calculating positional probability functions (PPFs) and principal components. We found that the content of proline, serine, threonine and alanine in O-glycosylated proteins is higher than in non-glycosylated proteins. Furthermore, we found that serine near the N or C terminus is easily glycosylated, and that threonine near the N terminus is easily glycosylated. Prediction was also treated as a classification problem: the test protein sequence is projected onto the common subspace, the distance between the projection and each class center is calculated, and the sequence is assigned to the "nearest" class. The prediction accuracy is about 60%-100%.
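The classification step described in the abstract (projection onto a common subspace, then assignment to the nearest class center) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the feature vectors are random toy data standing in for encoded sequence windows, the common subspace is taken to be a PCA fitted on the pooled training vectors, the distance is Euclidean, and the function names (`fit_common_subspace`, `project`, `class_centers`, `nearest_center`) are hypothetical.

```python
import numpy as np

def fit_common_subspace(X, n_components):
    """Fit PCA on pooled training vectors; return (mean, principal directions)."""
    mean = X.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Project one vector or a matrix of row vectors onto the subspace."""
    return (X - mean) @ components.T

def class_centers(Z, labels):
    """Mean of the projected training vectors for each class label."""
    return {c: Z[labels == c].mean(axis=0) for c in np.unique(labels)}

def nearest_center(z, centers):
    """Return the label whose center has the smallest Euclidean distance to z."""
    return min(centers, key=lambda c: np.linalg.norm(z - centers[c]))

# Toy data (assumption): two well-separated classes of 5-dimensional vectors.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=1.0, scale=0.1, size=(20, 5))   # stands in for glycosylated sites
X_neg = rng.normal(loc=-1.0, scale=0.1, size=(20, 5))  # stands in for non-glycosylated sites
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 20 + [0] * 20)

mean, comps = fit_common_subspace(X, n_components=2)
centers = class_centers(project(X, mean, comps), y)

# Classify a new vector by projecting it and picking the nearest class center.
test_vec = np.full(5, 0.9)
pred = nearest_center(project(test_vec, mean, comps), centers)
```

How sequence windows are encoded as numeric vectors, and how many principal components are retained, is not stated in the abstract, so both are left as free choices in this sketch.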
Keywords :
biology computing; principal component analysis; proteins; O-glycosylated protein; alanine; positional probability function; proline; serine; threonine; Accuracy; Amino acids; Educational institutions; Information analysis; Information science; Mathematics; Pattern analysis; Protein sequence; Testing
Conference_Title :
Intelligent Information Hiding and Multimedia Signal Processing, 2007. IIHMSP 2007. Third International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-0-7695-2994-1
DOI :
10.1109/IIH-MSP.2007.248