Title :
Deep convolutional nets and robust features for reverberation-robust speech recognition
Author :
Mitra, Vikramjit ; Wang, Wen ; Franco, Horacio
Author_Institution :
Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
Abstract :
While human listeners can understand speech in reverberant conditions, indicating that the auditory system is robust to such degradations, reverberation leads to high word error rates (WERs) for automatic speech recognition (ASR) systems. In this work, we present robust acoustic features, motivated by human speech perception, for use in a convolutional deep neural network (CDNN)-based acoustic model for recognizing continuous speech in reverberant conditions. Using a single-feature system trained on the single-channel data distributed through the REVERB 2014 challenge on ASR in reverberant conditions, we show a substantial relative reduction in WER compared to conventional filterbank energy-based features, for both simulated and real single-channel reverberation conditions. The reduction is more pronounced when multiple features and systems are combined. The proposed system outperforms the best system reported in the REVERB 2014 challenge for the single-channel full-batch processing task.
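The abstract reports its results as a relative reduction in WER. As a minimal sketch of how that metric is commonly computed (not the authors' scoring pipeline; the function names `word_error_rate` and `relative_wer_reduction` and all numbers in the example are hypothetical), WER can be taken as the word-level edit distance divided by the reference length, and the relative reduction as the drop in WER divided by the baseline WER:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only; these are NOT results from the paper.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
example_gain = relative_wer_reduction(baseline_wer=0.20, new_wer=0.15)
```

Note that a relative reduction is distinct from an absolute one: in the made-up example, going from 20% to 15% WER is a 5-point absolute drop but a 25% relative reduction.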
Keywords :
channel bank filters; neural nets; reverberation; speech recognition; ASR system; CDNN-based acoustic model; REVERB 2014; WER; automatic speech recognition system; convolutional deep neural network-based acoustic model; filterbank energy-based features; human speech perception; reverberation-robust speech recognition; single-feature system; word error rates; Hidden Markov models; Reverberation; Robustness; Speech; Speech recognition; Training; deep convolutional networks; feature combination; reverberation robustness; robust features; robust speech recognition;
Conference_Title :
Spoken Language Technology Workshop (SLT), 2014 IEEE
DOI :
10.1109/SLT.2014.7078633