DocumentCode :
2219971
Title :
Distilling classification models from cross validation runs: an application to mass spectrometry
Author :
Kalousis, Alexandros ; Prados, Julien ; Sanchez, Jean-Charles ; Allard, Laure ; Hilario, Melanie
Author_Institution :
CSD, Geneva Univ., Switzerland
fYear :
2004
fDate :
15-17 Nov. 2004
Firstpage :
113
Lastpage :
119
Abstract :
We present work on a proteomics application from the domain of mass spectrometry: the identification of biomarkers for stroke attacks. Mass-spectrometry-based biomarker identification poses a number of challenges to the knowledge discovery process. We describe how we tackle them and present a number of machine learning experiments that we performed in order to identify the most suitable learning algorithm for the given problem. However, in real-world applications, one of the main issues apart from good classification performance is obtaining an indication of the factors that really determine the classification decision. Usually, the algorithm that will provide the operational classification model is selected based on the results of a resampling-based performance estimation, e.g. cross-validation. In the next step the operational model should be constructed; nevertheless, it is not obvious how this should be done, since resampling-based procedures create a number of different models. We propose a method for linear classifiers that examines the different models produced with cross-validation. The method examines the stability of the models produced from the different training folds and combines them to provide a single model.
Keywords :
biology computing; data mining; learning (artificial intelligence); mass spectroscopy; pattern classification; proteins; support vector machines; SVM; biomarker identification; cross validation; cross validation runs; knowledge discovery process; machine learning experiments; mass spectrometry; operational classification model; proteomics application; resampled-based performance estimation; stroke attacks; Biological system modeling; Biomarkers; Chemistry; Laboratories; Machine learning; Machine learning algorithms; Mass spectroscopy; Proteins; Proteomics; Stability;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Tools with Artificial Intelligence, 2004. ICTAI 2004. 16th IEEE International Conference on
ISSN :
1082-3409
Print_ISBN :
0-7695-2236-X
Type :
conf
DOI :
10.1109/ICTAI.2004.51
Filename :
1374177