Title :
Amino Acid Composition Distribution: a Novel Sequence Representation for Prediction of Protein Subcellular Localization
Author :
Shi, Jianyu ; Zhang, Shaowu ; Pan, Quan ; Zhou, Guo-Ping
Author_Institution :
Sch. of Comput. Sci., Northwestern Polytech. Univ., Xi'an
Abstract :
This paper introduces a novel representation of protein sequences, amino acid composition distribution (AACD), for predicting subcellular localization. First, a protein sequence is divided equally into multiple segments. Then, the amino acid composition of each segment is calculated in order. Each protein sequence can thus be represented as a feature vector. Finally, the feature vectors of all sequences are input into multi-class support vector machines to predict subcellular localization. The results show that AACD represents protein sequences more effectively and is insensitive to sequence similarity, because it better reflects information about protein subcellular localization.
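The segment-and-count steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `aacd`, the 20-letter alphabet ordering, and the rule for placing segment boundaries when the length is not divisible by the number of segments are all assumptions made for illustration. The classification step (multi-class support vector machines) is omitted.

```python
# Minimal sketch of an AACD-style feature vector (illustrative, not the paper's code).
# Assumptions: the 20 standard amino acids in alphabetical one-letter order,
# and segment boundaries placed at i * n // k so segment lengths differ by at most 1.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aacd(sequence, n_segments):
    """Return the concatenated per-segment amino acid compositions."""
    seq = sequence.upper()
    n = len(seq)
    if n_segments < 1 or n < n_segments:
        raise ValueError("need 1 <= n_segments <= len(sequence)")

    features = []
    for i in range(n_segments):
        # Hypothetical boundary rule; the abstract only says "divided equally".
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        segment = seq[start:end]
        # Composition: fraction of the segment made up by each amino acid.
        # Non-standard residues (e.g. X) count toward the segment length only.
        for aa in AMINO_ACIDS:
            features.append(segment.count(aa) / len(segment))
    return features


if __name__ == "__main__":
    vector = aacd("AAAACCCC", 2)
    print(len(vector))  # 2 segments x 20 amino acids
```

With `n_segments = 1` this reduces to the ordinary amino acid composition of the whole sequence; larger values add positional information at the cost of a longer vector (20 values per segment).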
Keywords :
biology computing; cellular biophysics; molecular biophysics; proteins; support vector machines; amino acid composition distribution; protein sequence; protein subcellular localization; Amino acids; Automation; Chemistry; Computer science; Databases; Prediction methods; Sequences; Testing
Conference_Title :
Bioinformatics and Biomedical Engineering, 2007. ICBBE 2007. The 1st International Conference on
Conference_Location :
Wuhan
Print_ISBN :
1-4244-1120-3
DOI :
10.1109/ICBBE.2007.33