DocumentCode :
3570392
Title :
Robust speech separation using time-frequency masking
Author :
Aarabi, Parham ; Shi, Guangji ; Jahromi, Omid
Author_Institution :
Artificial Perception Lab., Toronto Univ., Ont., Canada
Volume :
1
fYear :
2003
Abstract :
A multi-microphone time-frequency speech masking technique is proposed. The technique uses both time-frequency magnitude and phase information to estimate the masking coefficients that maximize the signal-to-noise ratio (SNR) for each time-frequency block, given that the direction (or, equivalently, the time delay of arrival) of the speaker of interest is known. With this masking algorithm, speech features (such as formants) from the direction of interest are preserved, while features from other directions are severely degraded. Digit recognition experiments indicate that the proposed technique can substantially increase digit recognition accuracy. At 0 dB SNR, for example, the proposed technique improves the digit recognition accuracy rate by 26% over the single-microphone case and by 12% over the two-microphone superdirective beamforming case.
Keywords :
speech intelligibility; speech recognition; digit recognition experiments; signal-to-noise ratio; speech masking technique; speech separation; time-frequency masking; Fourier transforms; Independent component analysis; Integrated circuit modeling; Integrated circuit noise; Laboratories; Microphones; Robustness; Signal to noise ratio; Speech coding; Time frequency analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
Print_ISBN :
0-7803-7965-9
Type :
conf
DOI :
10.1109/ICME.2003.1221024
Filename :
1221024