Title :
Searching for a single mathematical function to address the nonlinear retention time shifts problem in nanoLC-MS data: A fuzzy-evolutionary computational proteomics approach
Author_Institution :
Inst. for Inf. Technol., Nat. Res. Council Canada, Ottawa, ON, Canada
Abstract :
Proteomics involves collecting and analyzing information about proteins within one or more complex samples in order to address a biological problem. One methodology is high-performance liquid chromatography coupled to mass spectrometry (nanoLC-MS). In this setting, accurately determining the nonlinear shifts in peptide retention times between runs is expected to increase the number of identified peptides and, hence, proteins. Many computational approaches exist for this problem, ranging from highly interactive to completely non-interactive algorithms for finding global or local functions that may be either explicit or implicit. This paper extends previous work and explores finding an explicit global function, which involves two stages: i) computation of a set of candidate functions (results) by the algorithm, and ii) searching within that set for patterns of interest. For the first stage, three classes of approximating global functions are considered: Class 1 functions, which have a completely unknown structure; Class 2 functions, which incorporate a tiny amount of domain knowledge; and Class 3 functions, which incorporate a small amount of domain knowledge. For the second stage, some issues with current similarity measures for mathematical expressions are discussed and a new measure is proposed. Preliminary experimental results are reported for Gene Expression Programming (an Evolutionary Computation algorithm and a variant of Genetic Programming) used with a fuzzy membership within the fitness function.
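The idea of placing a fuzzy membership inside a fitness function can be illustrated with a minimal sketch. The abstract does not specify the membership shape, the tolerance, or how memberships are aggregated, so everything below is an assumption made for illustration: a triangular membership on the residual, averaged over matched peptides. The names `triangular_membership`, `fuzzy_fitness`, and `tolerance`, as well as the toy retention times, are hypothetical and are not taken from the paper. The evolutionary search itself (Gene Expression Programming) is not implemented here; the sketch only scores candidate functions that such a search would propose.

```python
def triangular_membership(residual, tolerance):
    """Degree in [0, 1] to which a residual counts as 'close to zero'.

    Assumed shape: 1 at residual 0, falling linearly to 0 at |residual| >= tolerance.
    """
    return max(0.0, 1.0 - abs(residual) / tolerance)


def fuzzy_fitness(candidate, rt_run_a, rt_run_b, tolerance):
    """Mean membership of the residuals candidate(rt_a) - rt_b over matched peptides."""
    memberships = [
        triangular_membership(candidate(a) - b, tolerance)
        for a, b in zip(rt_run_a, rt_run_b)
    ]
    return sum(memberships) / len(memberships)


# Toy, made-up retention times (minutes) for the same peptides in two runs.
rt_run_a = [10.0, 20.0, 30.0, 40.0]
rt_run_b = [11.0, 22.0, 33.0, 44.0]


def identity(t):
    """Candidate that assumes no retention time shift."""
    return t


def scaled(t):
    """Candidate that assumes a 10% stretch between runs."""
    return 1.1 * t


fitness_identity = fuzzy_fitness(identity, rt_run_a, rt_run_b, tolerance=5.0)
fitness_scaled = fuzzy_fitness(scaled, rt_run_a, rt_run_b, tolerance=5.0)
```

Under these assumptions, a candidate whose predictions fall within the tolerance of the observed retention times scores close to 1, so `scaled` ranks above `identity` on the toy data. A graded score of this kind rewards near misses instead of treating every residual as an unbounded error, which is one plausible motivation for a fuzzy fitness.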
Keywords :
biocomputing; evolutionary computation; fuzzy set theory; proteins; proteomics; fuzzy-evolutionary computational proteomics approach; gene expression programming; genetic programming; liquid chromatography coupled mass spectrometry; mathematical function; nanoLC-MS; nanoLC-MS data; nonlinear retention time shifts problem; Biology computing; Current measurement; Information analysis; Mass spectroscopy; Nanobioscience; Peptides
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2010 IEEE Symposium on
Conference_Location :
Montreal, QC
Print_ISBN :
978-1-4244-6766-2
DOI :
10.1109/CIBCB.2010.5510688