DocumentCode :
180175
Title :
Zero-resource spoken term detection using hierarchical graph-based similarity search
Author :
Aoyama, Konosuke ; Ogawa, Anna ; Hattori, Toshihiro ; Hori, Toshikazu ; Nakamura, A.
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
7093
Lastpage :
7097
Abstract :
This paper presents fast zero-resource spoken term detection (STD) on a large-scale data set using a hierarchical graph-based similarity search method (HGSS). HGSS improves on the graph-based similarity search method (GSS) by reducing the search space, which yields higher speed. Instead of the degree-reduced k-nearest neighbor (k-DR) graph used by GSS, HGSS uses as its index a hierarchical k-DR graph, which is constructed from the cluster structure of the corresponding k-DR graph. The search algorithm for the hierarchical k-DR graph exploits this cluster structure to reduce the search space. HGSS inherits a useful property of GSS: it applies to any data set, with no restriction on the data type or the dissimilarity measure, because a graph is a general representation of relationships between objects. In the hierarchical graph, a vertex corresponds to a Gaussian mixture model (GMM) posteriorgram segment, and an edge corresponds to the relationship between a pair of GMM posteriorgram segments, measured by dynamic time warping. Experimental results show that HGSS reduces the computational cost by more than 40% compared with GSS at nearly the same accuracy.
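The search idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The frame-level cost (negative log inner product), the length normalization, the plain greedy descent, and the two-level layout (a small graph over representative vertices used to pick an entry point into the full graph) are all assumptions made for illustration; the abstract does not specify how the hierarchical k-DR graph is built or searched. The names local_cost, dtw, greedy_search, and hierarchical_search are hypothetical.

```python
import math


def local_cost(p, q, eps=1e-10):
    """Assumed frame-level cost: negative log inner product of two posterior vectors."""
    return -math.log(max(sum(a * b for a, b in zip(p, q)), eps))


def dtw(x, y):
    """Length-normalized DTW dissimilarity between two sequences of posterior vectors."""
    n, m = len(x), len(y)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = local_cost(x[i - 1], y[j - 1])
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m] / (n + m)


def greedy_search(graph, segments, query, start, dist=dtw):
    """Greedy descent on a neighborhood graph.

    graph maps a vertex id to a list of neighbor ids; segments maps a vertex
    id to its posteriorgram segment. Returns (vertex, dissimilarity,
    number of dissimilarity evaluations).
    """
    current, best = start, dist(query, segments[start])
    evaluations = 1
    while True:
        scored = [(dist(query, segments[v]), v) for v in graph[current]]
        evaluations += len(scored)
        if not scored:
            return current, best, evaluations
        d_min, v_min = min(scored)
        if d_min >= best:
            # No neighbor is closer to the query: local minimum reached.
            return current, best, evaluations
        current, best = v_min, d_min


def hierarchical_search(top_graph, full_graph, segments, query, top_start):
    """Assumed two-level search: descend on a coarse graph of representatives,
    then refine on the full graph starting from the vertex found there."""
    entry, _, n_top = greedy_search(top_graph, segments, query, top_start)
    vertex, d, n_low = greedy_search(full_graph, segments, query, entry)
    return vertex, d, n_top + n_low
```

In this sketch the cost being reduced is the number of DTW evaluations, which greedy_search and hierarchical_search return as their third value. Entering the full graph near the answer shortens the descent, which is the intuition behind a smaller search space.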
Keywords :
Gaussian processes; graph theory; mixture models; search problems; signal detection; speech processing; GMM posteriorgram segments; Gaussian mixture model posteriorgram segment; HGSS; STD; cluster structure; degree-reduced k-nearest neighbor graph; dynamic time warping; hierarchical graph-based similarity search; hierarchical k-DR graph; search algorithm; search space reduction; zero-resource spoken term detection; Accuracy; Acoustics; Clustering algorithms; Indexes; Search methods; Speech; Speech processing; Dynamic time warping; Neighborhood graph index; Query-by-example search; Spoken term detection; Zero resource
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6854976
Filename :
6854976