DocumentCode :
3031634
Title :
Speeding up subcellular localization by extracting informative regions of protein sequences for profile alignment
Author :
Wang, Wei ; Mak, Man-Wai ; Kung, Sun-Yuan
Author_Institution :
Dept. of Electron. & Inf. Eng., Hong Kong Polytech. Univ., Hong Kong, China
fYear :
2010
fDate :
2-5 May 2010
Firstpage :
1
Lastpage :
8
Abstract :
The functions of proteins are closely related to their subcellular locations. In the post-proteomics era, the amount of gene and protein data grows exponentially, which necessitates the prediction of subcellular localization by computational means. This paper proposes reducing the computational burden of alignment-based approaches to subcellular localization prediction by using the information provided by N-terminal sorting signals. To this end, a cascaded fusion of cleavage site prediction and profile alignment is proposed. Specifically, the informative segments of protein sequences are identified by a cleavage site predictor. Then, only the informative segments are passed to a homology-based classifier for predicting the subcellular locations. Experimental results on a newly constructed dataset show that the method exploits the strengths of both approaches and attains higher accuracy than using the full-length sequences. Moreover, the method reduces the computation time 20-fold. We expect the method to be valuable for biologists conducting large-scale protein annotation and for bioinformaticians performing preliminary investigations of new algorithms that involve pairwise alignments.
Keywords :
bioinformatics; cellular biophysics; genetics; macromolecules; pattern classification; proteins; sorting; N-terminal sorting signal; cleavage site prediction; gene; homology-based classifier; informative region; informative segment; post-proteomics era; profile alignment; protein annotation; protein sequence; subcellular localization; Amino acids; Bioinformatics; Genomics; Peptides; Prediction methods; Proteins; Sequences; Sorting; Support vector machine classification; Support vector machines; Subcellular localization; cleavage sites prediction; profiles alignment; protein sequences; support vector machines
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2010 IEEE Symposium on
Conference_Location :
Montreal, QC
Print_ISBN :
978-1-4244-6766-2
Type :
conf
DOI :
10.1109/CIBCB.2010.5510320
Filename :
5510320