• DocumentCode
    1057955
  • Title

    A Partial Set Covering Model for Protein Mixture Identification Using Mass Spectrometry Data

  • Author

    He, Zengyou ; Yang, Can ; Yu, Weichuan

  • Author_Institution
    Sch. of Software, Dalian Univ. of Technol., Dalian, China
  • Volume
    8
  • Issue
    2
  • fYear
    2011
  • Firstpage
    368
  • Lastpage
    380
  • Abstract
    Protein identification is an essential step in mass spectrometry (MS)-based proteome research. To date, there are many protein identification strategies that employ either MS data or MS/MS data for database searching. While MS-based methods provide wider coverage than MS/MS-based methods, their identification accuracy is lower because MS data contain less information than MS/MS data. It is therefore desirable to design more sophisticated algorithms that achieve higher identification accuracy using MS data. Peptide Mass Fingerprinting (PMF) has been widely used for many years to identify single purified proteins from MS data. In this paper, we extend this technology to protein mixture identification. First, we formulate the problem of protein mixture identification as a Partial Set Covering (PSC) problem. Then, we present several algorithms that can solve the PSC problem efficiently. Finally, we extend the partial set covering model to both MS/MS data and the combination of MS data and MS/MS data. The experimental results on simulated data and real data demonstrate the advantages of our method: 1) it outperforms previous MS-based approaches significantly; 2) it is useful in MS/MS-based protein inference; and 3) it combines MS data and MS/MS data in a unified model such that the identification performance is further improved.
  • Keywords
    bioinformatics; linear programming; mass spectra; proteins; proteomics; search problems; MS/MS-based method; MS/MS-based protein inference; database searching; mass spectrometry data; partial set covering model; peptide mass fingerprinting; protein mixture identification; proteome research; Algorithm design and analysis; Bioinformatics; Computational biology; Databases; Fingerprint recognition; Mass spectroscopy; Peptides; Protein engineering; Proteomics; Protein identification; mass spectrometry; optimization; set covering; Algorithms; Databases, Protein; Mass Spectrometry; Peptide Mapping; Proteins; Sequence Analysis, Protein
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2009.54
  • Filename
    5066962