Title of article :
Standard machine learning algorithms applied to UPLC-TOF/MS metabolic fingerprinting for the discovery of wound biomarkers in Arabidopsis thaliana
Author/Authors :
Boccard، نويسنده , , Julien and Kalousis، نويسنده , , Alexandros and Hilario، نويسنده , , Melanie and Lantéri، نويسنده , , Pierre and Hanafi، نويسنده , , Mohamed and Mazerolles، نويسنده , , Gérard and Wolfender، نويسنده , , Jean-Luc and Carrupt، نويسنده , , Pierre-Alain and Rudaz، نويسنده , , Serge، نويسنده ,
Issue Information :
دوفصلنامه با شماره پیاپی سال 2010
Abstract :
Metabolomics experiments involve the simultaneous detection of a high number of metabolites leading to large multivariate datasets and computer-based applications are required to extract relevant biological information. A high-throughput metabolic fingerprinting approach based on ultra performance liquid chromatography (UPLC) and high resolution time-of-flight (TOF) mass spectrometry (MS) was developed for the detection of wound biomarkers in the model plant Arabidopsis thaliana. High-dimensional data were generated and analysed with chemometric methods.
s, machine learning classification algorithms constitute promising tools to decipher complex metabolic phenotypes but their application remains however scarcely reported in that research field. The present work proposes a comparative evaluation of a set of diverse machine learning schemes in the context of metabolomic data with respect to their ability to provide a deeper insight into the metabolite network involved in the wound response. Standalone classifiers, i.e. J48 (decision tree), kNN (instance-based learner), SMO (support vector machine), multilayer perceptron and RBF network (neural networks) and Naive Bayes (probabilistic method), or combinations of classification and feature selection algorithms, such as Information Gain, RELIEF-F, Correlation Feature-based Selection and SVM-based methods, are concurrently assessed and cross-validation resampling procedures are used to avoid overfitting.
tudy demonstrates that machine learning methods represent valuable tools for the analysis of UPLC-TOF/MS metabolomic data. In addition, remarkable performance was achieved, while the modelsʹ stability showed the robustness and the interpretability potential. The results allowed drawing attention to both temporal and spatial metabolic patterns in the context of stress signalling and highlighting relevant biomarkers not evidenced with standard data treatment.
Keywords :
Machine Learning , mass spectrometry , Metabolomics , Arabidopsis thaliana , UPLC-MS , DATA MINING
Journal title :
Chemometrics and Intelligent Laboratory Systems
Journal title :
Chemometrics and Intelligent Laboratory Systems