DocumentCode
2137721
Title
Solving the small sample size problem in protein subcellular localization prediction
Author
Tong Wang ; Xiaoxia Cao ; Tian Xia ; Zhizhen Yang
Author_Institution
Inst. of Comput. & Inf., Shanghai Second Polytech. Univ., Shanghai, China
fYear
2012
fDate
16-18 Oct. 2012
Firstpage
915
Lastpage
918
Abstract
This paper proposes a new system to improve the performance of protein subcellular localization prediction. First, the protein sequences are quantized into a high-dimensional space using an effective sequence encoding scheme. This representation, however, gives rise to the small sample size problem, in which the data dimension is much larger than the sample size. To address this problem, a new dimension reduction algorithm is introduced. It extracts the essential features from the high-dimensional feature space and does not itself suffer from the small sample size problem. An efficient classifier is then employed to recognize the subcellular localization of proteins from the features obtained after dimension reduction.
Keywords
feature extraction; molecular biophysics; proteins; high dimension feature space; protein sequences; protein subcellular localization prediction; reduction algorithm; sequence encoding scheme; small sample size problem; manifold learning; prediction system
fLanguage
English
Publisher
IEEE
Conference_Titel
Biomedical Engineering and Informatics (BMEI), 2012 5th International Conference on
Conference_Location
Chongqing
Print_ISBN
978-1-4673-1183-0
Type
conf
DOI
10.1109/BMEI.2012.6513152
Filename
6513152