DocumentCode :
599143
Title :
An efficient dynamic programming algorithm for phosphorylation site assignment of large-scale mass spectrometry data
Author :
Saeed, Fahad ; Pisitkun, T. ; Hoffert, Jason D. ; Wang, Guanghui ; Gucek, M. ; Knepper, Mark A.
Author_Institution :
Epithelial Syst. Biol. Lab., Nat. Heart Lung & Blood Inst. (NHLBI), Bethesda, MD, USA
fYear :
2012
fDate :
4-7 Oct. 2012
Firstpage :
618
Lastpage :
625
Abstract :
Phosphorylation site assignment for large-scale, high-throughput tandem mass spectrometry (LC-MS/MS) data is an important aspect of phosphoproteomics. Correct assignment of the phosphorylated residue(s) is important for functional interpretation of the data within a biological context. Common search algorithms (Sequest, etc.) for mass spectrometry data are not designed for accurate site assignment; thus, additional algorithms are needed. In this paper, we propose a linear-time and linear-space dynamic programming strategy for phosphorylation site assignment. The algorithm, referred to as PhosSA, optimizes an objective function defined as the summation of peak intensities that are associated with theoretical phosphopeptide fragmentation ions. Quality control is achieved through the use of a post-processing criterion whose value is indicative of the signal-to-noise (S/N) properties and redundancy of the fragmentation spectra. The algorithm is tested using experimentally generated data sets of peptides with known phosphorylation sites while varying the fragmentation strategy (CID or HCD) and the molar amounts of the peptides. The algorithm is also compatible with various peptide labeling strategies, including SILAC and iTRAQ. PhosSA is shown to achieve >99% accuracy with a high degree of sensitivity. The algorithm is extremely fast and scalable (able to process up to 0.5 million peptides in an hour). The implemented algorithm is freely available at http://helixweb.nih.gov/ESBL/PhosSA/ for academic purposes.
Keywords :
biochemistry; bioinformatics; dynamic programming; mass spectroscopy; proteins; proteomics; LC-MS-MS data; SILAC; biological context; efficient dynamic programming algorithm; fragmentation spectra; fragmentation strategy; high throughput tandem mass spectrometry data; iTRAQ; large-scale mass spectrometry data; linear-space dynamic programming strategy; linear-time dynamic programming strategy; phosphoproteomics; phosphorylated residue; phosphorylation site assignment; quality control; signal-to-noise properties; Algorithm design and analysis; Dynamic programming; Heuristic algorithms; Ions; Mass spectroscopy; Peptides; Redundancy
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Biomedicine Workshops (BIBMW), 2012 IEEE International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
978-1-4673-2746-6
Electronic_ISBN :
978-1-4673-2744-2
Type :
conf
DOI :
10.1109/BIBMW.2012.6470210
Filename :
6470210