DocumentCode :
542219
Title :
Enhanced posteriors bias prediction for robust multi-stream ASR combining voicing and estimate reliabilities
Author :
Glotin, Hervé
Author_Institution :
ERSS - CNRS, 5 all. Machado; Toulouse Cedex 1 - France
Volume :
1
fYear :
2002
fDate :
13-17 May 2002
Abstract :
We discuss the fusion of speech and phoneme estimate reliabilities in a multi-stream Automatic Speech Recognizer (ASR) to improve ASR robustness. The Full Combination (FC) approach decomposes the full-band posterior probability of each phoneme into a reliability-weighted sum of the posteriors of stream combinations. Previous studies show that the weighting factors in FC should take into account not only the reliability of the speech signal but also the intrinsic efficiency of the subband experts. To control these two variables for each combination of posteriors, we derive a new model called "Posteriors Bias Prediction" (PBP), inspired by the Shannon Correction system. We show that FC is a specific type of PBP, and that PBP allows the integration of stream reliability based on the voicing level R (correlated with the signal-to-noise ratio, SNR) and the phoneme's class. Tests on telephonic free digits (Numbers95) under various noise types and SNR levels demonstrate that PBP outperforms FC, J-RASTA, and Spectral Subtraction methods.
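The reliability-weighted sum described in the abstract can be illustrated with a minimal sketch. It assumes two sub-band streams, three phoneme classes, and hand-picked weights; the function name `combine_posteriors`, the example posteriors, and the weights are all hypothetical and are not taken from the paper. In particular, the paper's PBP weights depend on the voicing level R and the phoneme class, which this sketch does not model.

```python
def combine_posteriors(combo_posteriors, combo_weights):
    """Fuse per-combination phoneme posteriors into one posterior vector.

    combo_posteriors: dict mapping a tuple of stream indices to a list of
        phoneme posteriors estimated from that combination of streams.
    combo_weights: dict with the same keys, giving a non-negative
        reliability weight per combination (normalized here to sum to 1).
    """
    total = sum(combo_weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    n_classes = len(next(iter(combo_posteriors.values())))
    fused = [0.0] * n_classes
    for combo, posterior in combo_posteriors.items():
        w = combo_weights[combo] / total  # normalized reliability weight
        for k, p in enumerate(posterior):
            fused[k] += w * p
    return fused


# Hypothetical example: 2 streams give 4 combinations, including the empty
# one, whose "posterior" is assumed here to be the class priors.
posteriors = {
    (): [0.4, 0.3, 0.3],
    (0,): [0.7, 0.2, 0.1],
    (1,): [0.2, 0.5, 0.3],
    (0, 1): [0.6, 0.3, 0.1],
}
weights = {(): 0.1, (0,): 0.2, (1,): 0.3, (0, 1): 0.4}  # assumed values

fused = combine_posteriors(posteriors, weights)
```

Because each input vector sums to one and the weights are normalized, the fused vector is again a valid probability distribution over phoneme classes.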
Keywords :
Adaptation model; Hidden Markov models; Robustness; Signal to noise ratio; Speech processing; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5743717
Filename :
5743717