DocumentCode
2264090
Title
Approximate nearest neighbor search on HDD
Author
Himei, Noritaka ; Wada, Toshikazu
Author_Institution
Wakayama Univ., Wakayama, Japan
fYear
2009
fDate
Sept. 27 2009-Oct. 4 2009
Firstpage
2101
Lastpage
2108
Abstract
Nearest Neighbor (NN) search plays important roles in Computer Vision algorithms. Especially, NN search on immensely large amount of image data set stored on the Internet is getting highlighted. For dealing with such huge data, main memory of a single PC is insufficient. As a solution, we propose an approximate NN search on hard disk drive (HDD) in this paper. This algorithm is based on recently proposed Principal Component Hashing (PCH). In our algorithm ¿PCH on HDD¿ (PCHD), the hash bins are represented by the leaf nodes of B+ tree for dealing with the dynamic addition and deletion of the data. Of course, the search time is slower than the original PCH. However, we found some advantages of this approach through the experiments using standard PC and 10000 stored images: 1) the memory consumption is 42 times smaller, 2) the first search time including the cold start-up time is 4.3 times faster (PCH:31.8[s], PCHD: 7.4[s]), 3) and interestingly, the successive searches are accelerated owing to the cache mechanism embedded in the operating system (mean search time decreases from 7.4[s] to 0.64[s]). We also confirmed that our algorithm performs NN search on 1 million image datasets with only 193MB memory consumption; however, PCH cannot, because of the huge memory consumption. These properties reveal that this algorithm is suitable for non-time-critical NN search applications and NN search engine called by web servers, where the search engine starts up in response to occasional queries.
Keywords
Internet; computer vision; disc drives; file organisation; file servers; hard discs; image recognition; image retrieval; operating systems (computers); principal component analysis; search engines; trees (mathematics); B+ tree; Internet; Web servers; approximate nearest neighbor search; cache mechanism; computer vision algorithm; data deletion; dynamic data addition; hard disk drive; hash bins; image datasets; leaf nodes; occasional query; operating system; principal component hashing; search engine; Acceleration; Clustering algorithms; Computer vision; Conferences; Hard disks; Internet; Large-scale systems; Nearest neighbor searches; Neural networks; Search engines;
fLanguage
English
Publisher
ieee
Conference_Titel
Computer Vision Workshops (ICCV Workshops), 2009 IEEE 12th International Conference on
Conference_Location
Kyoto
Print_ISBN
978-1-4244-4442-7
Electronic_ISBN
978-1-4244-4441-0
Type
conf
DOI
10.1109/ICCVW.2009.5457540
Filename
5457540
Link To Document