Title :
Nearest Neighbor Voting in High-Dimensional Data: Learning from Past Occurrences
Author :
Nenad Tomasev;Dunja Mladenic
Author_Institution :
Artificial Intelligence Laboratory, Jozef Stefan Institute, Ljubljana, Slovenia
Abstract :
Hubness is a recently described aspect of the curse of dimensionality inherent to nearest-neighbor methods. In this paper we present a new approach for exploiting the hubness phenomenon in k-nearest neighbor classification. We argue that some neighbor occurrences carry more information than others by virtue of being less frequent events. This observation is related to the hubness phenomenon, and we explore how it affects high-dimensional k-nearest neighbor classification. We propose a new algorithm, Hubness Information k-Nearest Neighbor (HIKNN), which introduces k-occurrence informativeness into the hubness-aware k-nearest neighbor voting framework. Our evaluation on high-dimensional data shows significant improvements over both the basic k-nearest neighbor approach and all previously used hubness-aware approaches.
Keywords :
"Equations","Training","Approximation algorithms","Bayesian methods","Correlation","Vectors","Mathematical model"
Conference_Title :
Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on
Print_ISBN :
978-1-4673-0005-6
DOI :
10.1109/ICDMW.2011.127