Title :
Cepstral compensation by polynomial approximation for environment-independent speech recognition
Author :
Raj, Bhiksha ; Gouvêa, Evandro B. ; Moreno, Pedro J. ; Stern, Richard M.
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
Speech recognition systems perform poorly on speech degraded by even simple effects such as linear filtering and additive noise. One possible solution to this problem is to modify the probability density function (PDF) of clean speech to account for the effects of the degradation. However, even for the case of linear filtering and additive noise, it is extremely difficult to do this analytically. Previously attempted analytical solutions to the problem of noisy speech recognition have either used an overly simplified mathematical description of the effects of noise on the statistics of speech, or relied on the availability of large environment-specific adaptation sets. Some of the previous methods required adaptation data consisting of simultaneously recorded ("stereo") clean and degraded speech. We introduce an approximation-based method to compute the effects of the environment on the parameters of the PDF of clean speech. We perform compensation by vector polynomial approximations (VPS) for the effects of linear filtering and additive noise on the clean speech. We also estimate the parameters of the environment, namely the noise and the channel, by using piecewise-linear approximations of these effects. We evaluate the performance of VPS using the CMU SPHINX-II system and the 100-word alphanumeric CENSUS database. Performance is evaluated at several SNRs, with artificial white Gaussian noise added to the database. VPS provides improvements of up to 15 percent in relative recognition accuracy.
Keywords :
Gaussian noise; approximation theory; cepstral analysis; database management systems; piecewise-linear techniques; polynomials; speech enhancement; speech processing; speech recognition; 100 word alphanumeric CENSUS database; CMU SPHINX-II system; PDF; additive noise; approximation based method; artificial white Gaussian noise; cepstral compensation; clean speech; environment independent speech recognition; linear filtering; noisy speech recognition; piecewise linear approximations; polynomial approximation; probability density function; relative recognition accuracy; speech degradation; speech recognition systems; vector polynomial approximations; Additive noise; Cepstral analysis; Degradation; Maximum likelihood detection; Piecewise linear approximation; Polynomials; Speech analysis; Speech enhancement; Speech recognition; Working environment noise;
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607277
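
Illustrative sketch (not part of the record):
The abstract describes approximating the effect of linear filtering and additive noise so that the parameters of the clean-speech PDF can be updated. The sketch below illustrates that idea in a simplified scalar setting. It rests on several assumptions that the abstract does not state: it uses the commonly used log-spectral model y = x + h + log(1 + exp(n - x - h)) for clean speech x, channel h, and noise n; it fits a second-order polynomial by least squares with NumPy; and it treats x as a single Gaussian. All function names and numeric values are hypothetical, and this is not the authors' VPS algorithm, which operates on vectors. The reason a polynomial may help is that the expectation of a polynomial of a Gaussian variable has a closed form, so the degraded mean follows directly from the clean mean and variance.

```python
import numpy as np


def softplus(z):
    """Nonlinearity log(1 + exp(z)), computed in a numerically stable way."""
    return np.logaddexp(0.0, z)


def fit_softplus_poly(center, half_width, degree=2, num_points=201):
    """Least-squares polynomial fit to softplus around `center`.

    The fit is done in the shifted variable (z - center); coefficients are
    returned highest degree first, as np.polyfit does.
    """
    z = np.linspace(center - half_width, center + half_width, num_points)
    return np.polyfit(z - center, softplus(z), degree)


def approx_degraded_mean(mu_x, var_x, h, n, half_width=2.0):
    """Approximate E[y] for x ~ N(mu_x, var_x) using a quadratic fit.

    With z = n - x - h, z is Gaussian with mean mu_z = n - mu_x - h and
    variance var_x, so E[z - mu_z] = 0 and E[(z - mu_z)^2] = var_x.
    """
    mu_z = n - mu_x - h
    c2, c1, c0 = fit_softplus_poly(mu_z, half_width, degree=2)
    # The linear coefficient c1 drops out because E[z - mu_z] = 0.
    return mu_x + h + c0 + c2 * var_x


def reference_degraded_mean(mu_x, var_x, h, n, order=40):
    """Reference value of E[y] by Gauss-Hermite quadrature on the exact model."""
    nodes, weights = np.polynomial.hermite.hermgauss(order)
    x = mu_x + np.sqrt(2.0 * var_x) * nodes
    y = x + h + softplus(n - x - h)
    return float(np.sum(weights * y) / np.sqrt(np.pi))


if __name__ == "__main__":
    # Hypothetical values: clean log-spectral mean and variance, channel, noise.
    approx = approx_degraded_mean(mu_x=3.0, var_x=0.25, h=0.5, n=3.0)
    exact = reference_degraded_mean(mu_x=3.0, var_x=0.25, h=0.5, n=3.0)
    print("polynomial approximation of degraded mean:", approx)
    print("quadrature reference:", exact)
```

The fitting interval (`half_width`) and the polynomial degree are free choices in this sketch; a wider interval or a broader Gaussian would generally call for a higher degree to keep the approximation error small.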