• DocumentCode
    3722777
  • Title

    Fast Approximate near Neighbor Algorithm by Clustering in High Dimensions

  • Author

    Hung Tran-The;Vinh Nguyen Van;Minh Hoang Anh

  • Author_Institution
    Z8 FPT Software JSC, Hanoi, Vietnam
  • fYear
    2015
  • Firstpage
    274
  • Lastpage
    279
  • Abstract
    This paper addresses the (r, 1 + ϵ)-approximate near neighbor problem ( or (r, 1 + v)-NN) that is defined as follows: given a set of n points in a d-dimensional space, a query point q and parameter 0 <; δ <; 1, build a data structure that reports a point within distance (1 + ϵ)r from q with probability 1 - δ, if there is a point in the data set within distance r from q. We present an algorithm for this problem in the Hamming space by using a new clustering technique, the triangle inequality and the locality sensitive hashing approach. Our algorithm achieves O(n1+ 2ρ/1+2ρ + dn) space, and O(fdn 2ρ/1+2ρ) query time, where f is generally a small integer, ρ is a parameter of the algorithm in (STOC 1998) [1] or (STOC 2015) [2]. Our results show that we can improve the algorithms in [1], [2] when value ϵ is small (ϵ <; 1 for [1] and ϵ <; 0.5 for [2]).
  • Keywords
    "Clustering algorithms","Approximation algorithms","Partitioning algorithms","Approximation methods","Databases","Algorithm design and analysis","Electronic mail"
  • Publisher
    ieee
  • Conference_Titel
    Knowledge and Systems Engineering (KSE), 2015 Seventh International Conference on
  • Type

    conf

  • DOI
    10.1109/KSE.2015.72
  • Filename
    7371795