DocumentCode :
2264090
Title :
Approximate nearest neighbor search on HDD
Author :
Himei, Noritaka ; Wada, Toshikazu
Author_Institution :
Wakayama Univ., Wakayama, Japan
fYear :
2009
fDate :
Sept. 27 2009-Oct. 4 2009
Firstpage :
2101
Lastpage :
2108
Abstract :
Nearest Neighbor (NN) search plays important roles in Computer Vision algorithms. Especially, NN search on immensely large amount of image data set stored on the Internet is getting highlighted. For dealing with such huge data, main memory of a single PC is insufficient. As a solution, we propose an approximate NN search on hard disk drive (HDD) in this paper. This algorithm is based on recently proposed Principal Component Hashing (PCH). In our algorithm ¿PCH on HDD¿ (PCHD), the hash bins are represented by the leaf nodes of B+ tree for dealing with the dynamic addition and deletion of the data. Of course, the search time is slower than the original PCH. However, we found some advantages of this approach through the experiments using standard PC and 10000 stored images: 1) the memory consumption is 42 times smaller, 2) the first search time including the cold start-up time is 4.3 times faster (PCH:31.8[s], PCHD: 7.4[s]), 3) and interestingly, the successive searches are accelerated owing to the cache mechanism embedded in the operating system (mean search time decreases from 7.4[s] to 0.64[s]). We also confirmed that our algorithm performs NN search on 1 million image datasets with only 193MB memory consumption; however, PCH cannot, because of the huge memory consumption. These properties reveal that this algorithm is suitable for non-time-critical NN search applications and NN search engine called by web servers, where the search engine starts up in response to occasional queries.
Keywords :
Internet; computer vision; disc drives; file organisation; file servers; hard discs; image recognition; image retrieval; operating systems (computers); principal component analysis; search engines; trees (mathematics); B+ tree; Internet; Web servers; approximate nearest neighbor search; cache mechanism; computer vision algorithm; data deletion; dynamic data addition; hard disk drive; hash bins; image datasets; leaf nodes; occasional query; operating system; principal component hashing; search engine; Acceleration; Clustering algorithms; Computer vision; Conferences; Hard disks; Internet; Large-scale systems; Nearest neighbor searches; Neural networks; Search engines;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Computer Vision Workshops (ICCV Workshops), 2009 IEEE 12th International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-4442-7
Electronic_ISBN :
978-1-4244-4441-0
Type :
conf
DOI :
10.1109/ICCVW.2009.5457540
Filename :
5457540
Link To Document :
بازگشت