Regional Information Center for Science and Technology - Performance improvement of a bitstream-based front-end for wireless speech recognition in adverse environments

DocumentCode :

1133507

Title :

Performance improvement of a bitstream-based front-end for wireless speech recognition in adverse environments

Author :

Kim, Hong Kook ; Cox, Richard V. ; Rose, Richard C.

Author_Institution :

AT&T Labs.-Res., USA

Volume :

Issue :

fYear :

2002

fDate :

11/1/2002

Firstpage :

591

Lastpage :

604

Abstract :

We propose a feature enhancement algorithm for wireless speech recognition in adverse acoustic environments. A speech recognition system is realized at the network side of a wireless communications system and feature parameters are extracted directly from the bitstream of the speech coder employed in the system, where the feature parameters are composed of spectral envelope information and coder-specific information. The coder-specific information is apt to be affected by environmental noise because the speech coder fails to generate high quality speech in noisy environments. We first found that enhancing noisy speech prior to speech coding improves the recognizer's performance. However, our aim was to develop a robust front-end operating at the network side of a wireless communications system without regard to whether speech enhancement was applied at the sender side. We investigated the effect of a speech enhancement algorithm on the bitstream-based feature parameters. Consequently, a feature enhancement algorithm is proposed which incorporates feature parameters obtained from the decoded speech and a noise suppressed version of the decoded speech. The coder-specific information can also be improved by re-estimating the codebook gains and residual energy from the enhanced residual signal. HMM-based connected digit recognition experiments show that the proposed feature enhancement algorithm significantly improves recognition performance at low signal-to-noise ratio (SNR) without causing poorer performance at high SNR. From large vocabulary speech recognition experiments with far-field microphone speech signals recorded in an office environment, we show that the feature enhancement algorithm greatly improves word recognition accuracy.
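
The abstract mentions re-estimating residual energy from an enhanced residual signal and evaluating across SNR levels. As a rough illustration only, the sketch below shows the generic textbook quantities involved: a linear-prediction residual obtained by inverse filtering, its log energy, and an SNR in dB. It is not the authors' algorithm; the function names (`lpc_residual`, `log_energy`, `snr_db`), the sign convention for the predictor coefficients, and the energy floor are all assumptions made for this example.

```python
import numpy as np

def lpc_residual(frame, lpc):
    """Inverse-filter one frame: e[n] = s[n] - sum_k a_k * s[n-k].

    `lpc` holds hypothetical predictor coefficients a_1..a_p;
    samples before the start of the frame are treated as zero.
    """
    frame = np.asarray(frame, dtype=float)
    residual = frame.copy()
    for k, a_k in enumerate(lpc, start=1):
        residual[k:] -= a_k * frame[:-k]
    return residual

def log_energy(signal, floor=1e-10):
    """Base-10 log of the summed squared samples, floored to avoid log(0)."""
    energy = float(np.sum(np.square(np.asarray(signal, dtype=float))))
    return float(np.log10(max(energy, floor)))

def snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    return float(10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2)))

# Usage: a frame that a first-order predictor (a_1 = 0.5) models exactly,
# so the residual is a single impulse and its log energy is zero.
frame = [1.0, 0.5, 0.25, 0.125]
residual = lpc_residual(frame, [0.5])
residual_log_energy = log_energy(residual)
```

In a bitstream-based front-end the predictor coefficients would come from the coder's transmitted spectral parameters rather than from a fresh analysis; that step is coder-specific and is not shown here.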

Keywords :

hidden Markov models; radio networks; speech coding; speech enhancement; speech recognition; voice communication; HMM-based connected digit recognition; SNR; adverse acoustic environments; bitstream-based feature parameters; bitstream-based front-end; codebook gain; coder-specific information; decoded speech; environmental noise; far-field microphone speech signals; feature enhancement algorithm; feature parameters extraction; large vocabulary speech recognition; noisy environments; noisy speech enhancement; office environment; quality speech; residual energy; signal-to-noise ratio; spectral envelope information; speech coder; speech coding; speech enhancement algorithm; wireless communications networks; wireless speech recognition; word recognition accuracy; Data mining; Decoding; Feature extraction; Noise generators; Signal to noise ratio; Speech coding; Speech enhancement; Speech recognition; Wireless communication; Working environment noise;

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/TSA.2002.804302

Filename :

1175531

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1133507