Title of article :
Bagged k-nearest neighbours classification with uncertainty in the variables (Original Research Article)
Author/Authors :
Joe L. Villa Medina, Ricard Boqué, Joan Ferré
Issue Information :
Journal issue with serial number, year 2009
Pages :
7
From page :
62
To page :
68
Abstract :
An analytical result should be expressed as x ± U, where x is the experimental result obtained for a given variable and U is its uncertainty. This uncertainty is rarely taken into account in supervised classification. In this paper, we propose including the uncertainty of the experimental results when computing the reliability of classification. The method combines k-nearest neighbours (kNN) with a nested bootstrap scheme: new bootstrap training sets are generated with the classical bootstrap at the first level (B times) and with a new bootstrap method, called U-bootstrap, at the second level (D times). The first level reduces the effect of sampling and the second reduces the effect of the uncertainty. The resulting B × D bootstrap training sets are used to compute the reliability of classification for an unknown object using kNN, and the object is assigned to the class with the highest reliability. In this method, unlike classical kNN and Probabilistic Bagged k-nearest neighbours (PBkNN), the reliability of classification changes (increases or decreases) when the uncertainty is increased. These changes depend on the position of the unknown object with respect to the training objects. For the benchmark Wine dataset, we found a classification error rate (CER) similar to that of kNN (5.57%) and lower than that of PBkNN using Hamamoto's bootstrap (7.96%) or Efron's bootstrap (8.97%).
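The nested scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the U-bootstrap, so the sketch assumes it perturbs each training value with zero-mean Gaussian noise whose standard deviation is the value's uncertainty, and it assumes the reliability of a class is the fraction of the B × D kNN runs that vote for it. The function names, the parameter `u` and the default values of `k`, `B` and `D` are hypothetical, and the unknown object is left unperturbed.

```python
import numpy as np


def knn_predict(X_train, y_train, x, k):
    """Plain kNN majority vote using Euclidean distance."""
    dist = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dist, kind="stable")[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]


def nested_bootstrap_reliability(X, y, u, x_new, k=3, B=20, D=10, seed=0):
    """Return {class: fraction of the B*D kNN runs voting for that class}.

    X: (n, p) training values; y: (n,) class labels;
    u: (n, p) uncertainties, used here as Gaussian standard deviations
    (an assumption, not the paper's definition of U-bootstrap).
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    votes = np.zeros(len(classes))
    n = len(X)
    for _ in range(B):
        # Level 1: classical bootstrap, resampling objects with replacement.
        idx = rng.integers(0, n, size=n)
        Xb, yb, ub = X[idx], y[idx], u[idx]
        for _ in range(D):
            # Level 2 (assumed form): perturb values within their uncertainty.
            Xd = Xb + rng.normal(0.0, ub)
            label = knn_predict(Xd, yb, x_new, k)
            votes[np.searchsorted(classes, label)] += 1
    return dict(zip(classes.tolist(), (votes / (B * D)).tolist()))
```

Under these assumptions, the object would be assigned to the class with the largest value in the returned dictionary, for example `max(rel, key=rel.get)` where `rel` is the result of `nested_bootstrap_reliability`.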
Keywords :
bootstrap , Nearest neighbours , Uncertainty , classification , reliability
Journal title :
Analytica Chimica Acta
Serial Year :
2009
Record number :
1037376