DocumentCode
730807
Title
Exemplar-based large vocabulary speech recognition using k-nearest neighbors
Author
Xu, Yanbo ; Siohan, Olivier ; Simcha, David ; Kumar, Sanjiv ; Liao, Hank
Author_Institution
Dept. of Electr. & Comput. Eng., Univ. of Maryland Coll. Park, College Park, MD, USA
fYear
2015
fDate
19-24 April 2015
Firstpage
5167
Lastpage
5171
Abstract
This paper describes a large-scale exemplar-based acoustic modeling approach for large vocabulary continuous speech recognition. We construct an index of labeled training frames, using high-level features extracted from the bottleneck layer of a deep neural network as indexing features. At recognition time, each test frame is turned into a query, and a set of k-nearest neighbor frames is retrieved from the index. This set is further filtered using majority voting, and the remaining frames are used to derive an estimate of the context-dependent state posteriors of the query, which can then be used for recognition. Using an approximate nearest neighbor search approach based on asymmetric hashing, we are able to construct an index on over 25,000 hours of training data. We present both frame classification and recognition experiments on a Voice Search task.
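The retrieval-and-voting pipeline in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function name `knn_state_posteriors`, the parameters `k` and `n_keep`, and the specific filtering rule (keep only neighbors whose label is among the `n_keep` most frequent) are hypothetical, since the abstract does not spell out the voting rule. Exact brute-force Euclidean search stands in for the approximate, asymmetric-hashing search the paper uses, and short tuples stand in for bottleneck features.

```python
import math
from collections import Counter


def knn_state_posteriors(query, index, k=5, n_keep=2):
    """Estimate state posteriors for one query frame (illustrative only).

    index: list of (feature_vector, state_label) pairs built from
    labeled training frames.
    """
    # Exact brute-force k-NN search; a stand-in for the approximate
    # nearest neighbor search described in the abstract.
    neighbors = sorted(index, key=lambda item: math.dist(query, item[0]))[:k]

    # Count how many retrieved neighbors carry each state label.
    votes = Counter(label for _, label in neighbors)

    # Assumed filtering rule: keep only the n_keep most frequent labels.
    kept = dict(votes.most_common(n_keep))

    # Posterior estimate: relative frequency among the remaining frames.
    total = sum(kept.values())
    return {label: count / total for label, count in kept.items()}


# Toy index of 2-D "features" with state labels (made-up data).
toy_index = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((0.0, 0.2), "b"),
    ((5.0, 5.0), "c"),
    ((0.3, 0.3), "a"),
    ((0.2, 0.1), "c"),
]
posteriors = knn_state_posteriors((0.0, 0.0), toy_index, k=5, n_keep=2)
```

In this toy example, three of the five retrieved neighbors carry label "a", so after filtering to two labels the estimate for "a" is 3/4. At the scale reported in the abstract, an exhaustive search like this would be impractical, which is why an approximate search is needed.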
Keywords
feature extraction; file organisation; neural nets; speech recognition; vocabulary; voice equipment; acoustic modeling; asymmetric hashing; context-dependent state posteriors; deep neural network; k-nearest neighbor; recognition time; vocabulary speech recognition; voice search task; Electronic publishing; Indexes; Information services; Market research; Training; exemplar-based recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178956
Filename
7178956