DocumentCode
470468
Title
Principal Component Analysis of O-linked Glycosylation Sites in Protein Sequence
Author
Yang, Xue-Mei ; Chen, Yen-wei ; Ito, Masahiro ; Nishikawa, Ikuko
Author_Institution
Xianyang Normal Univ., Xianyang
Volume
1
fYear
2007
fDate
26-28 Nov. 2007
Firstpage
121
Lastpage
126
Abstract
This paper presents a detailed analysis of the structure of O-glycosylated proteins, based on positional probability functions (PPFs) and principal components. We found that the content of proline, serine, threonine, and alanine is higher in O-glycosylated proteins than in nonglycosylated proteins. We also found that serine near the N or C terminus and threonine near the N terminus are easily glycosylated. Prediction is treated as a classification problem: a test protein sequence is projected onto the common subspace, the distance between the projection and each class center is calculated, and the sequence is assigned to the "nearest" class. The prediction accuracy ranges from about 60% to 100%.
Keywords
biology computing; principal component analysis; proteins; O-glycosylated protein; alanine; positional probability function; principal component analysis; proline; serine; threonine; Accuracy; Amino acids; Educational institutions; Information analysis; Information science; Mathematics; Pattern analysis; Principal component analysis; Protein sequence; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Information Hiding and Multimedia Signal Processing, 2007. IIHMSP 2007. Third International Conference on
Conference_Location
Kaohsiung
Print_ISBN
978-0-7695-2994-1
Type
conf
DOI
10.1109/IIH-MSP.2007.248
Filename
4457507