Title :
Automatic keyword selection for keyword search development and tuning
Author :
Jia Cui ; Mamou, Jonathan ; Kingsbury, Brian ; Ramabhadran, Bhuvana
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
In this paper, we investigate the problem of automatically selecting textual keywords for keyword search development and tuning on audio data in any language. Briefly, the method samples candidate keywords from the training data while trying to match a set of target marginal distributions for keyword features such as keyword frequency in the training or development audio, keyword length, frequency of out-of-vocabulary words, and TF-IDF scores. The method is evaluated on four IARPA Babel program base period languages. We demonstrate the use of the automatically selected keywords for keyword search system development and tuning, and we also show that search performance improves when the decision threshold is tuned on the automatically selected keywords.
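The abstract does not say how the sampling is carried out, so the following is only a minimal sketch of one way to pick keywords so that their feature histograms approach a set of target marginals. The greedy rule, the function name select_keywords, and the bucketed feature functions are all assumptions made for illustration; they are not taken from the paper.

```python
import random
from collections import Counter


def select_keywords(candidates, features, targets, n, seed=0):
    """Greedily pick up to n keywords whose per-feature histograms
    approach the target marginals (hypothetical helper, not the
    paper's algorithm).

    features: dict name -> function mapping a keyword to a bucket
    targets:  dict name -> dict bucket -> target proportion
    """
    rng = random.Random(seed)
    pool = list(dict.fromkeys(candidates))  # de-duplicate, keep order
    rng.shuffle(pool)                       # random tie-breaking order
    n = min(n, len(pool))
    counts = {name: Counter() for name in features}
    selected = []

    def cost_change(kw):
        # Change in summed L1 distance to the target counts if kw is added.
        total = 0.0
        for name, bucket_of in features.items():
            bucket = bucket_of(kw)
            want = targets[name].get(bucket, 0.0) * n
            have = counts[name][bucket]
            total += abs(have + 1 - want) - abs(have - want)
        return total

    for _ in range(n):
        best = min(pool, key=cost_change)   # first minimum wins ties
        pool.remove(best)
        selected.append(best)
        for name, bucket_of in features.items():
            counts[name][bucket_of(best)] += 1
    return selected


if __name__ == "__main__":
    # Toy candidates; the single feature here is keyword length in words.
    words = ["a", "b", "c", "d e", "f g", "h i"]
    feats = {"n_words": lambda k: len(k.split())}
    goal = {"n_words": {1: 0.5, 2: 0.5}}
    print(select_keywords(words, feats, goal, 4))
```

Further features named in the abstract (keyword frequency, out-of-vocabulary status, TF-IDF score) could be added as extra entries in features and targets, each with its own bucketing. With several features the greedy rule only approximates the targets and is not guaranteed to match every marginal exactly.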
Keywords :
audio signal processing; natural language processing; query processing; speech processing; speech recognition; vocabulary; IARPA Babel program base period languages; TF-IDF scores; audio data; automatic textual keyword selection; development audio; keyword features; keyword frequency; keyword length; keyword search development; keyword search tuning; out-of-vocabulary word frequency; query selection; spoken term detection; target marginal distributions; training audio; training data; Acoustics; Keyword search; NIST; Speech; Training; Training data; Tuning; keyword search; keyword selection
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6855126