DocumentCode :
3126388
Title :
Semi-supervised Discriminant Hashing
Author :
Kim, Saehoon ; Choi, Seungjin
Author_Institution :
Dept. of Comput. Sci., Pohang Univ. of Sci. & Technol., Pohang, South Korea
fYear :
2011
fDate :
11-14 Dec. 2011
Firstpage :
1122
Lastpage :
1127
Abstract :
Hashing refers to methods for embedding high-dimensional data into a similarity-preserving low-dimensional Hamming space such that similar objects are indexed by binary codes whose Hamming distances are small. Learning hash functions from data has recently been recognized as a promising approach to approximate nearest neighbor search for high-dimensional data. Most 'learning to hash' methods resort to either unsupervised or supervised learning to determine hash functions. Recently, a semi-supervised learning approach was introduced in hashing, where pairwise constraints (must-link and cannot-link) using labeled data are leveraged while unlabeled data are used for regularization to avoid over-fitting. In this paper, we base our semi-supervised hashing on linear discriminant analysis, where hash functions are learned such that labeled data are used to maximize the separability between binary codes associated with different classes, while unlabeled data are used for regularization as well as for the balancing condition and pairwise decorrelation of bits. The resulting method is referred to as semi-supervised discriminant hashing (SSDH). Numerical experiments on the MNIST and CIFAR-10 datasets demonstrate that our method outperforms existing methods, especially in the case of short binary codes.
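The general idea of binary hashing described in the abstract can be illustrated with a minimal sketch. This is not the SSDH method: the projections below are drawn at random rather than learned from labeled and unlabeled data, and the names `make_projections`, `hash_code`, and `hamming` are hypothetical helpers introduced only for illustration. Each bit is taken as the sign of one linear projection, and codes are compared by Hamming distance.

```python
import random


def make_projections(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian projection vectors of length dim.

    Assumption: random projections stand in for the learned hash
    functions; a learning-to-hash method would fit these from data.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(x, projections):
    """Map a real-valued vector to a binary code, one bit per projection."""
    return [
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) >= 0 else 0
        for w in projections
    ]


def hamming(a, b):
    """Count the positions at which two equal-length codes differ."""
    return sum(u != v for u, v in zip(a, b))


W = make_projections(n_bits=16, dim=4)
x = [1.0, 0.5, -0.3, 2.0]
x_scaled = [2.0 * v for v in x]   # same direction as x
x_opposite = [-v for v in x]      # opposite direction to x

code_x = hash_code(x, W)
code_scaled = hash_code(x_scaled, W)
code_opposite = hash_code(x_opposite, W)

print(hamming(code_x, code_scaled))    # codes of vectors pointing the same way
print(hamming(code_x, code_opposite))  # codes of vectors pointing opposite ways
```

In this sketch, a vector and its positive multiple receive the same code, while a vector and its negation differ in every bit (unless a projection is exactly zero), which is the similarity-preserving behaviour in its simplest form. A learned method such as the one the abstract describes would choose the projections from data instead of sampling them.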
Keywords :
approximation theory; binary codes; cryptography; information retrieval; learning (artificial intelligence); statistical analysis; CIFAR-10 dataset; Hamming distances; MNIST dataset; binary codes; hash function learning; high dimensional data; labeled data; linear discriminant analysis; nearest neighbor search approximation; semi-supervised discriminant hashing; similarity search; similarity-preserving low-dimensional Hamming space; unlabeled data; Binary codes; Decorrelation; Eigenvalues and eigenfunctions; Hamming distance; Training; Training data; Vectors; Hashing; regularized discriminant analysis; semisupervised;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1550-4786
Print_ISBN :
978-1-4577-2075-8
Type :
conf
DOI :
10.1109/ICDM.2011.128
Filename :
6137325