DocumentCode :
2730275
Title :
On k-Nearest Neighbor Searching in Non-Ordered Discrete Data Spaces
Author :
Kolbe, D. ; Qiang Zhu ; Pramanik, Sarah
Author_Institution :
Michigan State Univ., East Lansing, MI, USA
fYear :
2007
fDate :
15-20 April 2007
Firstpage :
426
Lastpage :
435
Abstract :
A k-nearest neighbor (k-NN) query retrieves k objects from a database that are considered to be the closest to a given query point. Numerous techniques have been proposed in the past for supporting efficient k-NN searches in continuous data spaces. No such work has been reported in the literature for k-NN searches in a non-ordered discrete data space (NDDS). Performing k-NN searches in an NDDS raises new challenges. The Hamming distance is usually used to measure the distance between two vectors (objects) in an NDDS. Due to the coarse granularity of the Hamming distance, a k-NN query in an NDDS may lead to a large set of candidate solutions, creating a high degree of non-determinism for the query result. We propose a new distance measure, called granularity-enhanced Hamming (GEH) distance, that effectively reduces the number of candidate solutions for a query. We have also considered using multidimensional database indexing for implementing k-NN searches in NDDSs. Our experiments on synthetic and genomic data sets demonstrate that our index-based k-NN algorithm is effective and efficient in finding k-NNs in NDDSs.
Keywords :
database indexing; query processing; coarse granularity; granularity-enhanced Hamming distance; k-nearest neighbor query retrieval; k-nearest neighbor searching; multidimensional database indexing; nonordered discrete data space; Bioinformatics; DNA; Electronic commerce; Genomics; Hamming distance; Indexing; Information retrieval; Multidimensional systems; Multimedia databases; Sequences;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
Conference_Location :
Istanbul
Print_ISBN :
1-4244-0802-4
Type :
conf
DOI :
10.1109/ICDE.2007.367888
Filename :
4221691
Link To Document :
بازگشت