Adding controlled amount of noise to improve recognition of compressed and spectrally distorted speech

Author

Nouza, Jan ; Cerva, Petr ; Silovsky, Jan

Author_Institution

SpeechLab, Tech. Univ. of Liberec, Liberec, Czech Republic

fYear

2013

Firstpage

8046

Lastpage

8050

Abstract

This paper deals with the recognition of speech whose spectrum is notably distorted by lossy compression (namely MP3) or by some implementations of 'speech enhancement' techniques. We show that these non-linear treatments can introduce gaps in the spectrum that significantly change the distribution of MFCCs and degrade ASR performance. We propose a method that measures the level of spectrum distortion and uses it to add a controlled amount of noise to the signal. The noise effectively masks the gaps and helps especially in situations where the source and parameters of the distortion are not known and hence a properly matched acoustic model cannot be used. In spite of its simplicity, the method can significantly improve recognition of highly compressed or spectrally distorted speech signals. We demonstrate this in several large experiments conducted on publicly available speech databases, in two languages and for two types of spectral distortion.
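
The idea of adding a controlled amount of noise can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the distortion level is measured or how it is mapped to a noise level, so the distortion score in [0, 1], the function `snr_for_distortion`, its 40 dB and 20 dB end points, and the choice of white Gaussian noise are all assumptions made only for illustration.

```python
import math
import random


def signal_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def snr_for_distortion(score, snr_clean_db=40.0, snr_distorted_db=20.0):
    """Hypothetical mapping: more distortion -> lower SNR (more noise).

    `score` is an assumed distortion measure clipped to [0, 1]; the paper's
    actual measure and mapping are not given in the abstract.
    """
    score = min(1.0, max(0.0, score))
    return snr_clean_db + score * (snr_distorted_db - snr_clean_db)


def add_noise(samples, snr_db, seed=0):
    """Add white Gaussian noise scaled so the SNR is approximately snr_db."""
    rng = random.Random(seed)
    noise_power = signal_power(samples) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in samples]


# Usage: a 440 Hz tone sampled at 16 kHz stands in for a speech signal.
tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
noisy = add_noise(tone, snr_for_distortion(0.5))
```

In this sketch a higher distortion score yields a lower target SNR, so a more heavily distorted signal receives more masking noise; any real system would need the mapping tuned against recognition results.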

Keywords

speech coding; speech enhancement; speech recognition; ASR; MFCC; MP3; acoustic model; compressed speech signal; nonlinear treatment; spectrally distorted signal; spectrum distortion; speech databases; speech enhancement technique; Abstracts; Acoustics; Face recognition; Noise; Speech; compressed speech

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on

Conference_Location

Vancouver, BC

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2013.6639232

Filename

6639232