Hybrid acoustic models for distant and multichannel large vocabulary speech recognition

Author

Swietojanski, Pawel ; Ghoshal, Arnab ; Renals, Steve

Author_Institution

Centre for Speech Technol. Res., Univ. of Edinburgh, Edinburgh, UK

fYear

2013

fDate

8-12 Dec. 2013

Firstpage

285

Lastpage

290

Abstract

We investigate the application of deep neural network (DNN)-hidden Markov model (HMM) hybrid acoustic models for far-field speech recognition of meetings recorded using microphone arrays. We show that the hybrid models achieve significantly better accuracy than conventional systems based on Gaussian mixture models (GMMs). We observe up to 8% absolute word error rate (WER) reduction from a discriminatively trained GMM baseline when using a single distant microphone, and between 4% and 6% absolute WER reduction when using beamforming on various combinations of array channels. By training the networks on audio from multiple channels, we find the networks can recover a significant part of the accuracy difference between the single distant microphone and beamformed configurations. Finally, we show that the accuracy of a network recognising speech from a single distant microphone can approach that of a multi-microphone setup by training with data from other microphones.

Keywords

Gaussian processes; hidden Markov models; microphone arrays; neural nets; speech recognition; DNN; GMM; Gaussian mixture model; HMM; WER reduction; deep neural network; distant speech recognition; far field speech recognition; hidden Markov model; hybrid acoustic model; large vocabulary speech recognition; microphone array; multichannel speech recognition; single distant microphone; word error rate reduction; Acoustics; Array signal processing; Arrays; Microphones; Speech; Speech recognition; Training; Beamforming; Deep Neural Networks; Distant Speech Recognition; Meeting recognition; Microphone Arrays

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on

Conference_Location

Olomouc

Type

conf

DOI

10.1109/ASRU.2013.6707744

Filename

6707744