Title :
Perceptual audio features for unsupervised key-phrase detection
Author :
Von Zeddelmann, Dirk ; Kurth, Frank ; Müller, Meinard
Author_Institution :
KOM Dept., Fraunhofer-FKIE, Wachtberg, Germany
Abstract :
We propose a new type of audio feature (HFCC-ENS) as well as an unsupervised method for detecting short sequences of spoken words (key-phrases) within long speech recordings. Our technical contributions are threefold: Firstly, we propose to use bandwidth-adapted filterbanks instead of classical MFCC-style filters in the feature extraction step. Secondly, the time resolution of the resulting features is adapted to account for the temporal characteristics of the spoken phrases. Thirdly, the key-phrase detection step is performed by matching sequences of the resulting HFCC-ENS features with features extracted from a target speech recording. We evaluate the proposed method using the German Kiel Corpus and furthermore investigate speech-related properties of the proposed feature.
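Illustration (not part of the original record): the abstract's third step, matching a key-phrase feature sequence against a longer recording, could be sketched roughly as follows. The abstract does not specify the distance measure or the matching scheme, so the sliding-window comparison, the cosine-style distance, and the names normalize, match_curve and best_match are all assumptions made for this sketch. The feature vectors are toy data, not HFCC-ENS features.

```python
import math


def normalize(v):
    """Scale a feature vector to unit Euclidean length (zero vectors stay zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def match_curve(query, target):
    """Slide the query over the target; return one distance per start frame.

    Assumed measure (not taken from the abstract): 1 minus the mean inner
    product of aligned unit-length frames, so 0 indicates a perfect match.
    """
    q = [normalize(f) for f in query]
    t = [normalize(f) for f in target]
    n = len(q)
    curve = []
    for start in range(len(t) - n + 1):
        sim = sum(
            sum(a * b for a, b in zip(qf, tf))
            for qf, tf in zip(q, t[start:start + n])
        ) / n
        curve.append(1.0 - sim)
    return curve


def best_match(query, target):
    """Return (start_frame, distance) of the lowest point on the match curve."""
    curve = match_curve(query, target)
    start = min(range(len(curve)), key=curve.__getitem__)
    return start, curve[start]


# Toy example: 5 target frames, 2 query frames, 3 feature bands each.
target = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]]
query = [[0, 0, 2], [3, 3, 0]]
start, dist = best_match(query, target)
print(start, round(dist, 6))
```

In this toy case the query frames point in the same directions as target frames 2 and 3, so the lowest distance falls at start frame 2. A practical detector would presumably threshold the match curve and report every local minimum below the threshold rather than only the single best position.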
Keywords :
cepstral analysis; feature extraction; speech recognition; German Kiel Corpus; MFCC style filters; feature extraction step; perceptual audio features; speech recordings; spoken words sequences; unsupervised key phrase detection; Audio recording; Bandwidth; Feature extraction; Filters; Frequency; Hidden Markov models; Humans; Robustness; Speech processing; Statistics; HFCC; Speech features; key-phrase detection; key-phrase spotting
Conference_Title :
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495974