• Title of article

    The dangers of creating false classifications due to noise in electronic nose and similar multivariate analyses

  • Author/Authors

    Goodner, Kevin L.; Dreher, J. Glen; Rouseff, Russell L.

  • Issue Information
    Journal issue with serial number, year 2001
  • Pages
    6
  • From page
    261
  • To page
    266
  • Abstract
    Randomly generated data with error limits of 1–10%, along with experimental data, were used to demonstrate the dangers of over-fitting, which creates artificial differentiation. Analysis of variance (ANOVA), principal components analysis (PCA), and discriminant function analysis (DFA) were employed for the data analysis. Where the ratio of samples to variables (features) falls below six, single-class systems containing only random noise and random groupings can be misclassified into more than one group when discriminant techniques are employed. The smaller the group size, the more erroneous classifications are made. Larger sample sizes minimize the random noise and allow the true differences to show. A minimal number of variables (features) should be employed when developing classification models to avoid over-fitting the data. The ratio of data points to variables should be at least six to avoid over-fitting classification errors, and the model should be validated using data points not used in generating it.
  • Keywords
    Chemical sensor , statistics , Chemometrics
  • Journal title
    Sensors and Actuators B: Chemical
  • Serial Year
    2001
  • Record number

    1412842
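
The abstract's central claim, that pure noise can appear to separate into groups when there are few samples per variable, can be illustrated with a small simulation. The sketch below is not the authors' procedure: it swaps in a simple nearest-centroid classifier for DFA, and the function name `nearest_centroid_accuracy`, the 5% noise level, and the sample and feature counts are all assumptions chosen for illustration. It compares accuracy measured on the data used to build the model (resubstitution) with leave-one-out accuracy, which holds each point out of the model before classifying it.

```python
import random


def nearest_centroid_accuracy(n_samples, n_features, n_groups=2, noise=0.05, seed=0):
    """Return (resubstitution_accuracy, leave_one_out_accuracy) on pure noise.

    Hypothetical helper for illustration only: a nearest-centroid classifier
    stands in for DFA, and group labels are assigned at random.
    """
    rng = random.Random(seed)
    # Single-class data: every feature is 1.0 plus Gaussian noise (assumed 5%).
    X = [[1.0 + rng.gauss(0.0, noise) for _ in range(n_features)]
         for _ in range(n_samples)]
    # Balanced but random group labels, so no true group structure exists.
    y = [i % n_groups for i in range(n_samples)]
    rng.shuffle(y)

    def centroids(indices):
        # Mean feature vector of each group over the given sample indices.
        sums = {g: [0.0] * n_features for g in range(n_groups)}
        counts = {g: 0 for g in range(n_groups)}
        for i in indices:
            counts[y[i]] += 1
            for j in range(n_features):
                sums[y[i]][j] += X[i][j]
        return {g: [s / counts[g] for s in sums[g]]
                for g in sums if counts[g] > 0}

    def predict(x, cents):
        # Assign x to the group whose centroid is closest (squared distance).
        return min(cents, key=lambda g: sum((a - b) ** 2
                                            for a, b in zip(x, cents[g])))

    all_idx = list(range(n_samples))

    # Resubstitution: classify the same points that built the centroids.
    full = centroids(all_idx)
    resub = sum(predict(X[i], full) == y[i] for i in all_idx) / n_samples

    # Leave-one-out: rebuild the centroids without point i, then classify it.
    loo = sum(
        predict(X[i], centroids([k for k in all_idx if k != i])) == y[i]
        for i in all_idx
    ) / n_samples
    return resub, loo


if __name__ == "__main__":
    # Ratio of samples to variables: 0.5 (well below six) versus 6.
    for n_samples, n_features in [(10, 20), (120, 20)]:
        resub, loo = nearest_centroid_accuracy(n_samples, n_features)
        print(n_samples, n_features, round(resub, 2), round(loo, 2))
```

Because the labels are random, any accuracy well above chance on the resubstitution measure reflects over-fitting rather than real differences. Validating with points not used to build the model, as the abstract recommends, is what exposes this.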