• DocumentCode
    659585
  • Title
    A novel integrated method for human multiplex protein subcellular localization prediction
  • Author
    Hong Gu ; Junzhe Cao
  • Author_Institution
    Sch. of Control Sci. & Eng., Dalian Univ. of Technol., Dalian, China
  • fYear
    2013
  • fDate
    6-9 Oct. 2013
  • Firstpage
    58
  • Lastpage
    62
  • Abstract
    Protein subcellular localization prediction based on machine learning is a research focus in bioinformatics. The rapid growth of protein sequences in databases makes it hard for experts alone to label enough protein samples to train a learner that gives satisfactory prediction results. This paper proposes a novel integrated method for human multiplex protein subcellular localization prediction. To avoid manually evaluating and labeling the large volume of unseen proteins, the method uses an active sample selection algorithm to pick out protein samples with non-experimental labels as supplementary training data. These samples help train an ensemble predictor, which includes a protein identifying module, a single-label classifier, and a multilabel classifier. Numerical experiments show the effectiveness of the proposed approach.
  • Keywords
    Big Data; bioinformatics; learning (artificial intelligence); molecular biophysics; pattern classification; proteins; active sample selection algorithm; big data evaluation; big data labeling; bioinformatics; ensemble predictor; human multiplex protein subcellular localization prediction; machine learning; multilabel classifier; novel integrated method; protein identifying module; protein sequence growth; single-label classifier; Amino acids; Bioinformatics; Classification algorithms; Multiplexing; Prediction algorithms; Proteins; Training; active learning; big data; multiplex protein; protein subcellular localization; transductive learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data, 2013 IEEE International Conference on
  • Conference_Location
    Silicon Valley, CA
  • Type
    conf
  • DOI
    10.1109/BigData.2013.6691734
  • Filename
    6691734