Title :
Recognition of overlapping speech using digital MEMS microphone arrays
Author :
Zwyssig, Erich ; Faubel, Friedrich ; Renals, Steve ; Lincoln, Mike
Author_Institution :
Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK
Abstract :
This paper presents a new corpus of single and overlapping speech recorded using digital MEMS and analogue microphone arrays, together with results from speech separation and recognition experiments on this data. The corpus is a reproduction of the multi-channel Wall Street Journal audio-visual corpus (MC-WSJAV) and contains speech recorded in both a meeting room and an anechoic chamber, using two different microphone types and two different array geometries. The speech separation and recognition experiments were performed using SRP-PHAT-based speaker localisation, superdirective beamforming and multiple post-processing schemes, such as residual echo suppression and binary masking. Our simple, cMLLR-based recognition system matches the performance of state-of-the-art ASR systems on the single-speaker task and outperforms them on overlapping speech. The corpus will be made publicly available via the LDC in spring 2013.
Keywords :
echo suppression; microphone arrays; speech recognition; SRP-PHAT-based speaker localisation; anechoic chamber; digital MEMS microphone arrays; multichannel Wall Street Journal audio-visual corpus; multiple post-processing schemes; overlapping speech recognition; speech separation; superdirective beamforming; Array signal processing; Arrays; Micromechanical devices; Microphone arrays; Speech; Speech recognition; ASR; MEMS microphones; WSJ; microphone array
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639033