Title :
Correcting posteriors by using a feedback synthesis loop in robust ASR
Author_Institution :
ERSS, Toulouse, France
Abstract :
Current Automatic Speech Recognition (ASR) systems perform poorly in noisy speech conditions. We propose a new strategy to reinforce ASR robustness, based on a feedback loop from the posteriors produced by recognition back to signal synthesis. The key idea is to use the phoneme posteriors generated by recognition to compute an acoustic image (AI) at each frame and to compute its correlation with the input signal. The AI is the weighted sum of the phonemes' clean-speech spectra, where the weights are taken directly as the corresponding phoneme posteriors. The correlation between the AI and the input spectrum gives a Recognition Index (RI). We then show how a simple correction function applied to the posterior distribution using the RI improves the Word Error Rate in a continuous speech recognition task compared to a state-of-the-art ASR system (J-RASTA).
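The per-frame computation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the spectral representation, the exact correlation measure, or the form of the correction function. Pearson correlation is assumed for the RI, and `correct_posteriors` is a purely hypothetical correction (interpolation toward a uniform distribution when the RI is low). All function and variable names are illustrative.

```python
import math


def acoustic_image(posteriors, clean_spectra):
    """Weighted sum of per-phoneme clean-speech spectra.

    posteriors[i] is the posterior of phoneme i for the current frame;
    clean_spectra[i] is that phoneme's clean-speech spectrum (list of bins).
    """
    n_bins = len(clean_spectra[0])
    return [
        sum(p * spec[k] for p, spec in zip(posteriors, clean_spectra))
        for k in range(n_bins)
    ]


def recognition_index(image, observed):
    """Correlation between the acoustic image and the observed spectrum.

    Assumption: Pearson correlation; the abstract only says "correlation".
    """
    n = len(image)
    mean_i = sum(image) / n
    mean_o = sum(observed) / n
    cov = sum((a - mean_i) * (b - mean_o) for a, b in zip(image, observed))
    var_i = sum((a - mean_i) ** 2 for a in image)
    var_o = sum((b - mean_o) ** 2 for b in observed)
    denom = math.sqrt(var_i * var_o)
    # A flat spectrum has zero variance; report zero correlation in that case.
    return cov / denom if denom > 0 else 0.0


def correct_posteriors(posteriors, ri):
    """Hypothetical correction, NOT taken from the paper.

    The RI is clipped to [0, 1] and used as a trust factor: a low RI pulls
    the distribution toward uniform, a high RI leaves it nearly unchanged.
    """
    trust = min(max(ri, 0.0), 1.0)
    uniform = 1.0 / len(posteriors)
    return [trust * p + (1.0 - trust) * uniform for p in posteriors]


# Toy frame: two phonemes, four spectral bins (made-up values).
clean_spectra = [[1.0, 0.0, 0.0, 2.0], [0.0, 3.0, 1.0, 0.0]]
posteriors = [0.7, 0.3]
observed = [1.0, 0.0, 0.0, 2.0]

image = acoustic_image(posteriors, clean_spectra)
ri = recognition_index(image, observed)
corrected = correct_posteriors(posteriors, ri)
```

Because the correction is a convex combination of the original posteriors and a uniform distribution, the corrected values still sum to one, so they can be passed to a decoder in place of the originals.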
Keywords :
feedback; image recognition; maximum likelihood estimation; signal synthesis; speech recognition; speech synthesis; AI recognition; ASR; RI; acoustic image recognition; automatic speech recognition index; feedback synthesis loop; posterior correction; weighted sum phonemes; word error rate; Abstracts; Estimation; Hidden Markov models; Image segmentation; Noise; Noise measurement; Robustness
Conference_Title :
Signal Processing Conference, 2002 11th European
Conference_Location :
Toulouse