• DocumentCode
    2193051
  • Title

    Accelerating high-dimensional nearest neighbor queries

  • Author

    Lang, Christian A. ; Singh, Ambuj K.

  • Author_Institution
    Dept. of Comput. Sci., California Univ., Santa Barbara, CA, USA
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    109
  • Lastpage
    118
  • Abstract
    The performance of nearest neighbor (NN) queries degrades noticeably with increasing dimensionality of the data due to reduced selectivity of high-dimensional data and an increased number of seek operations during NN-query execution. If the NN-radii were known in advance, the disk accesses could be reordered such that seek operations are minimized. We therefore propose a new way of estimating the NN-radius based on the fractal dimensionality and sampling. It is applicable to any page-based index structure. We show that the estimation error is considerably lower than for previous approaches. In the second part of the paper, we present two applications of this technique. We show how the radius estimations can be used to transform k-NN queries into at most two range queries, and how it can be used to reduce the number of page reads during all-NN queries. In both cases, we observe significant speedups over traditional techniques for synthetic and real-world data.
  • Keywords
    fractals; natural sciences computing; query processing; very large databases; disk accesses; estimation error; fractal dimensionality; high-dimensional nearest neighbor queries; page-based index structure; radius estimations; range queries; reduced selectivity; sampling; seek operations; Acceleration; Bioinformatics; Computer science; Databases; Fractals; Genomics; Image segmentation; Nearest neighbor searches; Neural networks; Sampling methods;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Scientific and Statistical Database Management, 2002. Proceedings. 14th International Conference on
  • ISSN
    1099-3371
  • Print_ISBN
    0-7695-1632-7
  • Type

    conf

  • DOI
    10.1109/SSDM.2002.1029711
  • Filename
    1029711