DocumentCode :
3167566
Title :
Facilitating open vocabulary spoken term detection using a multiple pass hybrid search algorithm
Author :
Norouzian, Atta ; Rose, Richard
Author_Institution :
Dept. of ECE, McGill Univ., Montreal, QC, Canada
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
5169
Lastpage :
5172
Abstract :
This paper presents an efficient approach to spoken term detection (STD) from unstructured audio recordings using word lattices generated off-line from an automatic speech recognition (ASR) system. The approach facilitates open vocabulary STD and focuses specifically on reducing the difference between detection performance obtained for within-vocabulary (IV) and out-of-vocabulary (OOV) search terms. Improved OOV detection performance is obtained by using a two-pass search procedure. In the first pass, candidate audio segments are retrieved from an index of word lattice paths. In the second pass, locations of OOV search terms are detected from a constrained alignment of phonemic expansions of the query terms with phoneme sequences obtained from acoustic segments using an unconstrained neural-network-based phone decoder. For utterances taken from a lecture speech domain, the combination of first pass segment retrieval and second pass term verification is found to significantly increase STD performance for OOV query terms with no increase in search time.
Keywords :
acoustic signal processing; audio recording; neural nets; performance evaluation; query processing; speech recognition; vocabulary; word processing; ASR system; OOV detection performance improvement; OOV query terms; OOV search term locations; acoustic segments; audio retrieval; audio segments; automatic speech recognition; constrained alignment; detection performance; first pass segment retrieval; lecture speech domain; multiple pass hybrid search algorithm; offline word lattice generation; open vocabulary STD; open vocabulary spoken term detection; out-of-vocabulary search terms; phone decoder; phoneme sequences; phonemic expansions; second pass term verification; two-pass search procedure; unconstrained neural network; unstructured audio recording; within-vocabulary search terms; Acoustics; Decoding; Indexing; Lattices; Speech; Vocabulary; Spoken term detection; automatic speech recognition; spoken utterance retrieval;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6289084
Filename :
6289084