DocumentCode
149042
Title
Combining temporal and spectral information for Query-by-Example Spoken Term Detection
Author
Gracia, Ciro ; Anguera, Xavier ; Binefa, Xavier
Author_Institution
Dept. of Inf. & Commun. Technol., Univ. Pompeu Fabra, Barcelona, Spain
fYear
2014
fDate
1-5 Sept. 2014
Firstpage
1487
Lastpage
1491
Abstract
We present a system for Query-by-Example Spoken Term Detection on zero-resource languages. The system compares speech patterns by representing the signal with two different acoustic models: a Spectral Acoustic (SA) model covering the spectral characteristics of the signal, and a Temporal Acoustic (TA) model covering the temporal evolution of the speech signal. Given a query and an utterance to be compared, we first compute their posterior probabilities under each of the two models, then compute a similarity matrix for each model, and finally combine these into a single enhanced matrix. The Subsequence Dynamic Time Warping (S-DTW) algorithm is used to find optimal subsequence alignment paths on this final matrix. Our experiments on data from the 2013 Spoken Web Search (SWS) task of the MediaEval benchmark evaluation show that this approach provides state-of-the-art results and significantly improves on both the single-model strategies and the standard metric baselines.
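The pipeline outlined in the abstract (per-model posteriors, per-model similarity matrices, fusion, then S-DTW) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance measure or the fusion rule, so the negative-log inner product and the weighted sum below are assumptions, and the function names `neg_log_dot`, `combine`, and `sdtw` are hypothetical. Path-length normalisation, which practical systems often apply, is omitted for brevity.

```python
import numpy as np


def neg_log_dot(query_post, utt_post, eps=1e-10):
    """Distance matrix between posteriorgrams (frames x classes).

    Assumption: -log of the inner product, a common choice for posteriors.
    """
    return -np.log(np.clip(query_post @ utt_post.T, eps, None))


def combine(d_sa, d_ta, weight=0.5):
    """Hypothetical fusion rule: convex combination of the two matrices."""
    return weight * d_sa + (1.0 - weight) * d_ta


def sdtw(dist):
    """Subsequence DTW over a (query frames x utterance frames) matrix.

    The whole query must be matched, but the match may start and end at
    any utterance frame. Returns (start_frame, end_frame, path_cost).
    """
    n, m = dist.shape
    acc = np.full((n, m), np.inf)
    start = np.zeros((n, m), dtype=int)
    acc[0, :] = dist[0, :]        # free start: no cost for skipped frames
    start[0, :] = np.arange(m)
    for i in range(1, n):
        acc[i, 0] = acc[i - 1, 0] + dist[i, 0]
        start[i, 0] = start[i - 1, 0]
        for j in range(1, m):
            # Predecessors: vertical, horizontal, and diagonal steps.
            cands = (
                (acc[i - 1, j], start[i - 1, j]),
                (acc[i, j - 1], start[i, j - 1]),
                (acc[i - 1, j - 1], start[i - 1, j - 1]),
            )
            best_cost, best_start = min(cands, key=lambda c: c[0])
            acc[i, j] = best_cost + dist[i, j]
            start[i, j] = best_start
    end = int(np.argmin(acc[n - 1]))  # free end: best cell in the last row
    return int(start[n - 1, end]), end, float(acc[n - 1, end])
```

Given SA and TA posteriorgrams for a query and an utterance, a detection would be obtained by calling `sdtw(combine(neg_log_dot(q_sa, u_sa), neg_log_dot(q_ta, u_ta)))`, where the four input arrays are placeholders for the outputs of the two acoustic models.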
Keywords
audio databases; learning (artificial intelligence); pattern matching; query processing; speech processing; optimal subsequence alignment paths; query-by-example spoken term detection; spectral acoustic model; spectral information; speech patterns; speech signal; subsequence dynamic time warping algorithm; temporal acoustic model; temporal information; zero resource languages; Acoustics; Adaptation models; Computational modeling; Data models; Hidden Markov models; Speech; Vectors; Query by example; long temporal context; unsupervised learning; zero resources languages;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European
Conference_Location
Lisbon
Type
conf
Filename
6952537
Link To Document