Regional Information Center for Science and Technology - Deep convolutional nets and robust features for reverberation-robust speech recognition

DocumentCode :

3585086

Title :

Deep convolutional nets and robust features for reverberation-robust speech recognition

Author :

Mitra, Vikramjit ; Wen Wang ; Franco, Horacio

Author_Institution :

Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA

fYear :

2014

Firstpage :

548

Lastpage :

553

Abstract :

While human listeners can understand speech in reverberant conditions, indicating that the auditory system is robust to such degradations, reverberation leads to high word error rates for automatic speech recognition (ASR) systems. In this work, we present robust acoustic features motivated by human speech perception for use in a convolutional deep neural network (CDNN)-based acoustic model for recognizing continuous speech in reverberant conditions. Using a single-feature system trained with the single-channel data distributed through the REVERB 2014 challenge on ASR in reverberant conditions, we show a substantial relative reduction in word error rates (WERs) compared to the conventional filterbank energy-based features for single-channel simulated and real reverberation conditions. The reduction is more pronounced when multiple features and systems are combined. The proposed system outperforms the best system reported in the REVERB 2014 challenge on the single-channel full-batch processing task.

Keywords :

channel bank filters; neural nets; reverberation; speech recognition; ASR system; CDNN-based acoustic model; REVERB 2014; WER; automatic speech recognition system; convolutional deep neural network-based acoustic model; filterbank energy-based features; human speech perception; reverberation-robust speech recognition; single-feature system; word error rates; Hidden Markov models; Reverberation; Robustness; Speech; Speech recognition; Training; deep convolutional networks; feature combination; reverberation robustness; robust features; robust speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Spoken Language Technology Workshop (SLT), 2014 IEEE

Type :

conf

DOI :

10.1109/SLT.2014.7078633

Filename :

7078633

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3585086