  • DocumentCode
    2736509
  • Title
    Reducing the cost of protein identifications from mass spectrometry databases
  • Author
    Logan, B.; Kontothanassis, L.; Goddeau, D.; Moreno, P.J.; Hookway, R.; Sarracino, D.
  • Author_Institution
    Hewlett-Packard Labs, Cambridge, MA, USA
  • Volume
    2
  • fYear
    2004
  • fDate
    1-5 Sept. 2004
  • Firstpage
    3060
  • Lastpage
    3063
  • Abstract
    We present two techniques to improve the computational efficiency of protein discovery from mass spectrometry databases: noise filtering and hierarchical searching. Our approaches are orthogonal to existing algorithms and are based on the observation that typical mass spectrometry data contains a large amount of noise that can lead to wasteful computation. Our first improvement uses standard machine learning techniques with novel feature vectors derived from the mass spectra to identify and filter the noisy spectra. We demonstrate that this approach yields computational gains of around 38% with less than 10% loss of peptides. Additionally, we present a hierarchical searching scheme in which most samples are matched against a small database at low computational cost, leaving only a small number of samples to be searched against larger databases. Combining this scheme with the machine learning filters leads to a further performance improvement of 3%.
  • Keywords
    biochemistry; database management systems; learning (artificial intelligence); mass spectra; mass spectroscopy; medical information systems; medical signal processing; molecular biophysics; proteins; workflow management software; computational efficiency; feature vectors; hierarchical searching; machine learning techniques; mass spectra; mass spectrometry databases; noise filtering; noisy spectra; peptides; protein identification; workflow management; Computational efficiency; Costs; Filtering; Filters; Machine learning; Machine learning algorithms; Mass spectroscopy; Peptides; Proteins; Spatial databases; mass spectrometry; machine learning; noise filtering; workflow management
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Engineering in Medicine and Biology Society, 2004. IEMBS '04. 26th Annual International Conference of the IEEE
  • Conference_Location
    San Francisco, CA
  • Print_ISBN
    0-7803-8439-3
  • Type
    conf
  • DOI
    10.1109/IEMBS.2004.1403865
  • Filename
    1403865