Title of article :
A comparison of model-based and regression classification techniques applied to near infrared spectroscopic data in food authentication studies
Author/Authors :
Toher, Deirdre; Downey, Gerard; Murphy, Thomas Brendan
Issue Information :
Biannual, with serial issue number, year 2007
Abstract :
Classification methods can be used to classify samples of unknown type into known types. Many classification methods have been proposed in the chemometrics, statistical and computer science literature.
Model-based classification methods have been developed from a statistical modelling viewpoint. This approach allows for uncertainty in the classification procedure to be quantified using probabilities. Linear discriminant analysis and quadratic discriminant analysis are particular model-based classification methods.
Partial least squares discriminant analysis is commonly used in food authentication studies based on spectroscopic data. This method uses partial least squares regression with a binary outcome variable for two-group classification problems.
In this paper, model-based classification is compared to partial least squares discriminant analysis for its ability to correctly classify pure and adulterated honey samples when the honey has been extended by three different adulterants. Two model selection criteria are examined: the Bayesian Information Criterion and 5-fold cross validation. The methods are compared using the classification performance and the interpretability of the results.
In addition, since the percentage of adulterated samples in any given sample set is unlikely to be known in a real-life setting, the ability of updating procedures within model-based clustering to accurately predict the adulterated samples, even when the proportion of pure to adulterated samples in the training data is grossly unrepresentative of the true situation, is studied in detail.
The performance of both model-based and partial least squares discriminant analysis is found to be robust to the composition of the training data and to model selection method. The Bayesian Information Criterion is shown to be more robust than 5-fold cross validation as a model selection method, especially when the training data set is very small and unrepresentative of the entire data set.
Keywords :
Food authenticity , NIR spectroscopy , Classification , Model-based classification , Partial least squares regression
Journal title :
Chemometrics and Intelligent Laboratory Systems