Title : 
Memory efficient subsequence DTW for Query-by-Example Spoken Term Detection
         
        
            Author : 
Anguera, Xavier ; Ferrarons, Miquel
         
        
            Author_Institution : 
Telefonica Res., Barcelona, Spain
         
        
        
        
        
        
            Abstract : 
In this paper we propose a fast and memory-efficient subsequence Dynamic Time Warping algorithm (MES-DTW) for the task of Query-by-Example Spoken Term Detection (QbE-STD). The proposed algorithm is based on the subsequence-DTW (S-DTW) algorithm, which searches for short spoken queries within a much larger collection of spoken documents by fixing the start and end points in the query and discovering optimal matching subsequences along the search collection. It modifies S-DTW to make it better suited to the QbE-STD task, including a way to perform the matching with virtually no system memory, which is well suited to querying large-scale databases. We also describe the system used to perform QbE-STD, including an energy-based quantification for speech/non-speech detection and an overlap detector for putative matches. We test the proposed system on the MediaEval 2012 spoken-web-search dataset and show that, in addition to the memory savings, the proposed algorithm improves on the original S-DTW in both matching accuracy (up to 0.235 absolute MTWV increase) and speed (around 25% faster).
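The subsequence matching idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' MES-DTW: it is a generic subsequence DTW that keeps only one previous column of accumulated costs and propagates the start index of each path, so memory grows with the query length rather than with the size of the search collection. The function name `subsequence_dtw`, the scalar features, the absolute-difference local cost, and the unnormalized accumulated cost are all assumptions made for illustration.

```python
import math


def subsequence_dtw(query, doc):
    """Best match of `query` inside `doc` as (cost, start, end).

    Hypothetical illustration: scalar frames, absolute-difference cost,
    no path-length normalization. Only two columns of size len(query)
    are alive at any time.
    """
    n = len(query)
    prev_cost = [math.inf] * n   # accumulated costs for document frame j-1
    prev_start = [-1] * n        # document frame where each path began
    best = (math.inf, -1, -1)

    for j, y in enumerate(doc):
        cur_cost = [0.0] * n
        cur_start = [0] * n

        # First query frame: a path may start at any document frame.
        cur_cost[0] = abs(query[0] - y)
        cur_start[0] = j

        for i in range(1, n):
            local = abs(query[i] - y)
            # Predecessors: horizontal, diagonal, vertical.
            candidates = (
                (prev_cost[i], prev_start[i]),
                (prev_cost[i - 1], prev_start[i - 1]),
                (cur_cost[i - 1], cur_start[i - 1]),
            )
            cost, start = min(candidates, key=lambda c: c[0])
            cur_cost[i] = local + cost
            cur_start[i] = start

        # Last query frame reached: candidate match ending at frame j.
        if cur_cost[n - 1] < best[0]:
            best = (cur_cost[n - 1], cur_start[n - 1], j)

        prev_cost, prev_start = cur_cost, cur_start

    return best
```

For example, `subsequence_dtw([1, 2, 3], [0, 0, 1, 2, 3, 0, 0])` locates the query at document frames 2 to 4. A real QbE-STD system would use frame-level feature vectors and would typically normalize costs by path length before comparing candidate matches.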
         
        
            Keywords : 
Internet; audio databases; query processing; speech recognition; MES-DTW; MediaEval 2012 spoken-web-search dataset; QbE-STD; dynamic time warping; fixed start-end points; large scale database querying; memory efficient subsequence DTW; nonspeech detection; optimal matching subsequences; query-by-example spoken term detection; search collection; speech detection; spoken documents; spoken queries; Acoustics; Databases; Dynamic programming; Equations; Heuristic algorithms; Memory management; Speech; pattern matching
         
        
        
        
            Conference_Title : 
Multimedia and Expo (ICME), 2013 IEEE International Conference on
         
        
            Conference_Location : 
San Jose, CA
         
        
        
        
            DOI : 
10.1109/ICME.2013.6607546