Title : 
Efficient spoken term detection using confusion networks
         
        
            Author : 
Mangu, Lidia ; Kingsbury, Brian ; Soltau, Hagen ; Kuo, Hong-Kwang ; Picheny, Michael
         
        
            Author_Institution : 
IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
         
            Abstract : 
In this paper, we present a fast, vocabulary-independent algorithm for spoken term detection (STD) that demonstrates that a word-based index is sufficient to achieve good performance for both in-vocabulary (IV) and out-of-vocabulary (OOV) terms. Previous approaches have required that a separate index be built at the sub-word level and then expanded to allow for matching OOV terms. Such a process, while accurate, is expensive in both time and memory. In the proposed architecture, a single word-level confusion network (CN) based index is used for both IV and OOV search. This is implemented using a flexible weighted finite-state transducer (WFST) framework. Comparisons on three Babel languages (Tagalog, Pashto and Turkish) show that CN-based indexing performs better than the lattice-based approach while being orders of magnitude faster and having a much smaller footprint.
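As a rough illustration of the idea of a word-level CN index, the following minimal sketch builds an inverted index over confusion-network bins and scores a multi-word term by multiplying the posteriors of matching words in consecutive bins. It is not the authors' WFST implementation: the names `build_index` and `search`, the dictionary-based bin representation, the product-of-posteriors score, and the toy data are all assumptions made for illustration, and OOV handling and epsilon (skip) arcs are not covered.

```python
from collections import defaultdict


def build_index(networks):
    """Build a word -> postings map from confusion networks.

    networks: {utt_id: [bin, ...]}, where each bin is a dict {word: posterior}.
    This representation is an assumption made for this sketch.
    """
    index = defaultdict(list)
    for utt_id, bins in networks.items():
        for pos, alternatives in enumerate(bins):
            for word, posterior in alternatives.items():
                index[word].append((utt_id, pos, posterior))
    return index


def search(index, term):
    """Return {(utt_id, start_bin): score} for hits of `term`.

    A hit requires the query words to appear in consecutive bins; the score
    is the product of the word posteriors (an illustrative choice).
    """
    words = term.split()
    if not words:
        return {}
    # Postings of the first word seed the candidate hits.
    candidates = {(u, p): s for u, p, s in index.get(words[0], [])}
    for offset, word in enumerate(words[1:], start=1):
        postings = {(u, p): s for u, p, s in index.get(word, [])}
        # Keep only candidates whose next bin contains the next query word.
        candidates = {
            (u, p): score * postings[(u, p + offset)]
            for (u, p), score in candidates.items()
            if (u, p + offset) in postings
        }
    return candidates


# Toy data, invented for this example.
networks = {
    "utt1": [
        {"spoken": 0.9, "broken": 0.1},
        {"term": 0.6, "turn": 0.4},
        {"detection": 0.8, "direction": 0.2},
    ],
}
index = build_index(networks)
hits = search(index, "spoken term")
```

Here `hits` maps the utterance and starting bin of each match to its score. Because each bin holds only a handful of word alternatives with posteriors, an index of this kind stays compact compared with one built from full lattices, which is the trade-off the abstract describes.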
         
        
            Keywords : 
speech processing; vocabulary; 3 Babel languages; CN-based indexing; IV term; OOV term; Pashto; STD; Tagalog; Turkish; flexible WFST framework; in-vocabulary term; out-of-vocabulary term; spoken term detection; vocabulary independent algorithm; word-based index; word-level confusion network; Acoustics; Indexing; Lattices; Speech; Transducers; Vocabulary; audio indexing; confusion networks; keyword search; keyword spotting
         
        
        
        
            Conference_Title : 
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
         
        
            Conference_Location : 
Florence
         
        
        
            DOI : 
10.1109/ICASSP.2014.6855127