Title :
Optimal Hash Functions for Approximate Matches on the
-Cube
Author :
Gordon, Daniel M. ; Miller, Victor S. ; Ostapenko, Peter
Author_Institution :
IDA Center for Commun. Res., San Diego, CA, USA
fDate :
3/1/2010 12:00:00 AM
Abstract :
One way to find near-matches in large datasets is to use hash functions. In recent years locality-sensitive hash functions for various metrics have been given; for the Hamming metric projecting onto k bits is simple hash function that performs well. In this paper, we investigate alternatives to projection. For various parameters hash functions given by complete decoding algorithms for error-correcting codes work better, and asymptotically random codes perform better than projection.
Keywords :
approximation theory; cryptography; decoding; error correction codes; random codes; Hamming metric; decoding algorithms; error-correcting codes; optimal hash functions; random codes; Decoding; Error correction codes; Hamming distance; Information retrieval; Linear code; Nearest neighbor searches; Sequences; Vectors; Approximate nearest neighbor; distance-sum optimal sets; error exponent; locality-sensitive hashing; random codes;
Journal_Title :
Information Theory, IEEE Transactions on
DOI :
10.1109/TIT.2009.2039037