Abstract:
We examine the problem of large-scale nearest neighbor search in high-dimensional spaces and propose a new approach based on the close relationship between nearest neighbor search and the problems of signal representation and quantization. Our contribution is a very simple and efficient quantization technique using transform coding and product quantization. We demonstrate its effectiveness in several settings, including large-scale retrieval, nearest neighbor classification, feature matching, and similarity search based on the bag-of-words representation. Experiments on standard data sets show that it is competitive with state-of-the-art methods while offering greater speed, simplicity, and generality. The resulting compact representation can serve as the basis for more elaborate hierarchical search structures for sub-linear approximate search. Even without such structures, we demonstrate that optimized linear search using the quantized representation is extremely fast and trivially parallelizable on modern computer architectures, with further acceleration possible by way of GPU implementation.
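As an illustration of how linear search over a quantized representation can work, the following is a minimal sketch of generic product quantization with table-lookup distances. It is not the implementation evaluated in this work: the transform coding step is omitted, and all function names (`kmeans`, `split`, `train_codebooks`, `encode`, `search`) and parameters (`m` subspaces, `k` centers per subspace) are assumptions made for this example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns k centers (hypothetical helper)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            buckets[nearest].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centers

def split(v, m):
    """Cut a vector into m contiguous sub-vectors (assumes len(v) % m == 0)."""
    step = len(v) // m
    return [v[i * step:(i + 1) * step] for i in range(m)]

def train_codebooks(data, m, k):
    """Learn one k-center codebook per subspace."""
    return [kmeans([split(v, m)[s] for v in data], k, seed=s) for s in range(m)]

def encode(v, codebooks):
    """Replace each sub-vector by the index of its nearest center."""
    subs = split(v, len(codebooks))
    return [min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
            for sub, cb in zip(subs, codebooks)]

def search(query, codes, codebooks):
    """Linear scan: build per-subspace distance tables for the query once,
    then score every stored code with table lookups only."""
    subs = split(query, len(codebooks))
    tables = [[sq_dist(sub, c) for c in cb] for sub, cb in zip(subs, codebooks)]
    dists = [sum(tables[s][code[s]] for s in range(len(codebooks)))
             for code in codes]
    return min(range(len(codes)), key=dists.__getitem__)
```

Because each database vector is scored independently from small lookup tables, the scan in `search` can be split across threads or GPU blocks without coordination, which is the property the abstract refers to as trivially parallelizable.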
Keywords:
computer architecture; pattern recognition; signal representation; transform coding; GPU implementation; computer architectures; high dimensions; nearest neighbor search; product quantization; signal quantization; Acceleration; Computer vision; Degradation; Large-scale systems; Nearest neighbor searches; Quantization; Signal representations; System performance