Title :
Dynamic Features in the Linear-Logarithmic Hybrid Domain for Automatic Speech Recognition in a Reverberant Environment
Author :
Ichikawa, Osamu ; Fukuda, Takashi ; Nishimura, Masafumi
Author_Institution :
IBM Res. - Tokyo, Yamato, Japan
Abstract :
Static and dynamic features based on Mel frequency cepstral coefficients (MFCCs) are widely used in automatic speech recognition. Since the MFCCs are calculated from logarithmic spectra, the delta and delta-delta features amount to difference operations in the logarithmic domain. In a reverberant environment, speech signals contain late reverberations, whose power follows a long-term exponential decay. This tends to cause the logarithmic delta to hold a constant value for a long time. This paper considers new schemes for calculating delta and delta-delta features that quickly diminish in the reverberant segments. Experiments using the evaluation framework for reverberant environments (CENSREC-4) showed significant improvements when the MFCC dynamic features were simply replaced with the proposed dynamic features.
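The observation about the logarithmic delta can be illustrated with a small numerical sketch. It is not the scheme proposed in the paper, which the abstract does not specify. It only compares a conventional regression delta computed on the logarithm of an exponentially decaying power envelope with the same delta computed on the linear power. The decay rate, frame count, window size, and the helper name `delta` are assumptions chosen for illustration.

```python
import numpy as np

def delta(x, k=2):
    """Regression delta over +/-k frames; edge frames are padded by repetition."""
    n = len(x)
    pad = np.pad(x, (k, k), mode="edge")
    num = sum(i * (pad[k + i : n + k + i] - pad[k - i : n + k - i])
              for i in range(1, k + 1))
    den = 2 * sum(i * i for i in range(1, k + 1))
    return num / den

# Hypothetical reverberant tail: power decays exponentially from frame to frame.
decay_per_frame = 0.2            # assumed value, not taken from the paper
frames = np.arange(50)
power = np.exp(-decay_per_frame * frames)

log_delta = delta(np.log(power))  # delta in the logarithmic domain
lin_delta = delta(power)          # delta in the linear domain

# Away from the padded edges, the logarithmic delta stays at -decay_per_frame
# for the whole tail, while the linear delta shrinks along with the power.
print(log_delta[2:8])
print(lin_delta[[5, 20, 40]])
```

Because the logarithm of an exponential decay is a straight line, its slope never changes, which is why the logarithmic delta persists throughout the reverberant segment, whereas a difference taken on linear power fades as the tail fades.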
Keywords :
cepstral analysis; reverberation; speech recognition; CENSREC-4; MFCC; Mel frequency cepstral coefficients; automatic speech recognition; linear logarithmic hybrid domain; logarithmic delta-delta speech features; speech signal reverberations; Discrete cosine transforms; Hidden Markov models; Microphone arrays; Noise cancellation; Robustness; Transfer functions; Delta; dynamic feature; feature extraction
Journal_Title :
Selected Topics in Signal Processing, IEEE Journal of
DOI :
10.1109/JSTSP.2010.2057191