Clean speech reconstruction from noisy mel-frequency cepstral coefficients using a sinusoidal model

Author

Shao, Xu; Milner, Ben

Author_Institution

Sch. of Inf. Syst., East Anglia Univ., Norwich, UK

Volume

1

fYear

2003

fDate

6-10 April 2003

Abstract

This paper extends the technique of speech reconstruction from mel-frequency cepstral coefficients (MFCCs) to the case of noisy speech. Reconstructing a clean speech signal from noise-contaminated MFCCs requires an estimate of the clean mel-filterbank vector together with a robust estimate of the pitch. This work applies spectral subtraction to the mel-filterbank vector (derived from the noisy MFCCs) to provide a clean speech spectral estimate. To obtain a reliable estimate of pitch, a robust extraction technique is used. Spectrograms and informal listening tests reveal that a clean speech signal can be successfully reconstructed from the noisy MFCCs. Pitch errors are shown to manifest themselves as artificial-sounding bursts in the reconstructed speech signal, while incorrect estimates of the spectral envelope introduce periods of noise into the reconstructed speech.
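The spectral-estimation step described in the abstract (recover a mel-filterbank vector from noisy MFCCs, then apply spectral subtraction) can be illustrated with a minimal sketch. The abstract does not give the details, so everything specific here is an assumption made for illustration: the orthonormal DCT-II convention, zero-padding of truncated cepstra, a noise estimate taken from leading non-speech frames, and the over-subtraction factor alpha and spectral floor beta. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def dct_matrix(n_filters):
    """Orthonormal DCT-II matrix of shape (n_filters, n_filters)."""
    n = np.arange(n_filters)
    k = n[:, None]
    mat = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    mat[0] *= np.sqrt(1.0 / n_filters)
    mat[1:] *= np.sqrt(2.0 / n_filters)
    return mat


def mfcc_to_filterbank(mfcc, n_filters):
    """Invert MFCCs (frames x coefficients) to a mel-filterbank estimate.

    Assumes the MFCCs were computed as an orthonormal DCT-II of the log
    filterbank outputs. Truncated coefficients are zero-padded, so the
    result is a smoothed estimate when fewer than n_filters are given.
    """
    n_frames, n_ceps = mfcc.shape
    padded = np.zeros((n_frames, n_filters))
    padded[:, :n_ceps] = mfcc
    # The inverse of an orthonormal DCT is its transpose.
    log_fb = padded @ dct_matrix(n_filters)
    return np.exp(log_fb)


def estimate_noise(filterbank, n_lead=10):
    """Assumed noise estimate: mean of the first n_lead (non-speech) frames."""
    return filterbank[:n_lead].mean(axis=0)


def spectral_subtract(noisy_fb, noise_fb, alpha=1.0, beta=0.01):
    """Subtract a scaled noise estimate, flooring at beta times the noisy value."""
    return np.maximum(noisy_fb - alpha * noise_fb, beta * noisy_fb)
```

A typical call sequence under these assumptions would be `fb = mfcc_to_filterbank(mfcc, 23)`, then `clean_fb = spectral_subtract(fb, estimate_noise(fb))`. The floor keeps every channel strictly positive, which matters if the result is later passed through a logarithm.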

Keywords

cepstral analysis; channel bank filters; frequency estimation; signal reconstruction; speech recognition; MFCC; artificial sounding bursts; clean mel-filterbank vector; clean speech reconstruction; informal listening tests; noisy mel-frequency cepstral coefficients; pitch errors; robust extraction technique; robust pitch estimate; sinusoidal model; spectral envelope; spectral estimate; spectral subtraction; spectrograms; Acoustic noise; Cepstral analysis; Mel frequency cepstral coefficient; Noise robustness; Oral communication; Speech codecs; Speech enhancement; Speech processing; Speech recognition; Telecommunication standards

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-7663-3

Type

conf

DOI

10.1109/ICASSP.2003.1198878

Filename

1198878