• DocumentCode
    2137721
  • Title
    Solving the small sample size problem in protein subcellular localization prediction
  • Author
    Tong Wang ; Xiaoxia Cao ; Tian Xia ; Zhizhen Yang
  • Author_Institution
    Inst. of Comput. & Inf., Shanghai Second Polytech. Univ., Shanghai, China
  • fYear
    2012
  • fDate
    16-18 Oct. 2012
  • Firstpage
    915
  • Lastpage
    918
  • Abstract
    This paper proposes a new system to improve the performance of protein subcellular localization prediction. First, the protein sequences are quantized into a high-dimensional space using an effective sequence encoding scheme. This representation, however, causes the small sample size problem, in which the data dimension is much larger than the sample size. To address this problem, a new dimension reduction algorithm is introduced. It extracts the essential features from the high-dimensional feature space and does not suffer from the small sample size problem. An efficient classifier then recognizes the subcellular localization of proteins from the reduced features.
  • Keywords
    feature extraction; molecular biophysics; proteins; high dimension feature space; protein sequences; protein subcellular localization prediction; reduction algorithm; sequence encoding scheme; small sample size problem; manifold learning; prediction system
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Biomedical Engineering and Informatics (BMEI), 2012 5th International Conference on
  • Conference_Location
    Chongqing
  • Print_ISBN
    978-1-4673-1183-0
  • Type
    conf
  • DOI
    10.1109/BMEI.2012.6513152
  • Filename
    6513152