Title :
Unsupervised query-by-example spoken term detection using segment-based Bag of Acoustic Words
Author :
George, Boby ; Yegnanarayana, B.
Author_Institution :
Speech & Vision Lab., Int. Inst. of Inf. Technol., Hyderabad, India
Abstract :
In this work, we present an unsupervised framework for spotting spoken terms in large speech databases. The proposed segment-based Bag of Acoustic Words (BoAW) framework is inspired by the Bag of Words (BoW) approach widely used in text retrieval systems. Because this model ignores the sequence information in speech samples in order to index the database efficiently, a Dynamic Time Warping (DTW) based temporal matching technique is used to re-rank the results and restore the time sequence information. The speech data is stored in an inverted index, which makes retrieval very fast and makes the framework particularly useful for searching large databases. We also address the issue of choosing an appropriate speech segment size for reliable indexing. Comparison with other query-by-example spoken term detection systems shows that the proposed system outperforms them.
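The two-stage idea in the abstract (an order-free bag-of-words lookup through an inverted index, followed by DTW re-ranking of the shortlisted segments) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the features, the codebook, the segmentation scheme, or the scoring function, so everything below is an assumption made for illustration: a precomputed `codebook` of feature centroids, fixed non-overlapping segments of `seg_len` frames, histogram intersection as the bag-of-words score, Euclidean local distance in DTW, and all function names.

```python
from collections import Counter, defaultdict

import numpy as np


def quantize(frames, codebook):
    """Map each feature frame to the index of its nearest codeword ("acoustic word")."""
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)


def build_index(utterances, codebook, seg_len):
    """Inverted index: acoustic word -> {(utt_id, seg_start): count in that segment}.

    Assumption: fixed, non-overlapping segments of seg_len frames.
    """
    index = defaultdict(dict)
    for utt_id, frames in utterances.items():
        words = quantize(frames, codebook)
        for start in range(0, len(words), seg_len):
            counts = Counter(words[start:start + seg_len].tolist())
            for word, count in counts.items():
                index[word][(utt_id, start)] = count
    return index


def shortlist(query, index, codebook, top_k):
    """Rank segments by histogram intersection with the query (order is ignored)."""
    q_hist = Counter(quantize(query, codebook).tolist())
    scores = defaultdict(int)
    for word, q_count in q_hist.items():
        for seg, count in index.get(word, {}).items():
            scores[seg] += min(q_count, count)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


def dtw_cost(a, b):
    """Length-normalized DTW cost between two frame sequences."""
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(a[i - 1] - b[j - 1])
            acc[i, j] = local + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m] / (n + m)


def search(query, utterances, index, codebook, seg_len, top_k=5):
    """Shortlist with the inverted index, then re-rank by DTW (lower cost first)."""
    candidates = shortlist(query, index, codebook, top_k)

    def cost(seg):
        utt_id, start = seg
        return dtw_cost(query, utterances[utt_id][start:start + seg_len])

    return sorted(candidates, key=cost)
```

In this sketch the inverted index only touches segments that share at least one acoustic word with the query, which is where the speed-up over exhaustive matching would come from; DTW is then run on the `top_k` shortlisted segments only. The choice of `seg_len` corresponds to the segment-size question raised in the abstract.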
Keywords :
query processing; speech recognition; unsupervised learning; dynamic time warping; segment based bag of acoustic words; sequence information; temporal matching technique; text retrieval systems; unsupervised query by example spoken term detection; Acoustics; Histograms; Indexing; Speech; Vocabulary; Bag of Acoustic Words; query-by-example; segment ranking; spoken term detection; template matching
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854984