Title :
Robust digit recognition using phase-dependent time-frequency masking
Author :
Shi, Guangji ; Aarabi, Parham
Author_Institution :
Dept. of Electr. & Comput. Eng., Toronto Univ., Ont., Canada
Abstract :
A technique is proposed that uses the time-frequency phase information from two microphones, together with the time-delay-of-arrival (TDOA) of the signal of interest, to estimate an ideal time-frequency mask. At a signal-to-noise ratio (SNR) of 0 dB, the proposed two-microphone technique achieves a digit recognition rate of 71% (averaged over 5 speakers, each speaking 20-30 digits). In contrast, delay-and-sum beamforming achieves only a 40% recognition rate with two microphones and 60% with four microphones. Superdirective beamforming achieves a 44% recognition rate with two microphones and 65% with four microphones.
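The idea of a phase-dependent mask can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the soft-mask form 1 / (1 + gamma * error^2), the parameter gamma, and the sign convention (tdoa is the delay of microphone 2 relative to microphone 1, in seconds) are all assumptions made for illustration. For each frequency bin of one frame, the observed inter-microphone phase difference is compared with the difference expected for the signal of interest, and bins with a large phase error are attenuated.

```python
import cmath
import math


def wrap_phase(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def phase_mask(spec1, spec2, omegas, tdoa, gamma=5.0):
    """Hypothetical soft time-frequency mask for a single frame.

    spec1, spec2: complex spectra of microphones 1 and 2 (one value per bin)
    omegas:       angular frequency of each bin, in rad/s
    tdoa:         assumed delay of microphone 2 relative to microphone 1, in s
    gamma:        assumed sharpness parameter (larger = more aggressive mask)
    """
    mask = []
    for x1, x2, w in zip(spec1, spec2, omegas):
        # Phase error: observed phase difference minus the difference
        # expected for a source arriving with the given TDOA.
        err = wrap_phase(cmath.phase(x1) - cmath.phase(x2) - w * tdoa)
        # Near 1 when the bin is consistent with the target, small otherwise.
        mask.append(1.0 / (1.0 + gamma * err * err))
    return mask


def apply_mask(spec, mask):
    """Scale each bin of a spectrum by the corresponding mask value."""
    return [m * x for m, x in zip(mask, spec)]


# Usage: bin 0 matches the target delay, bin 1 arrives from the opposite side.
w = 2.0 * math.pi * 1000.0   # 1 kHz bin
tdoa = 1e-4                  # 0.1 ms, an illustrative value
x1 = 1.0 + 0.0j
target_x2 = x1 * cmath.exp(-1j * w * tdoa)
interferer_x2 = x1 * cmath.exp(+1j * w * tdoa)
mask = phase_mask([x1, x1], [target_x2, interferer_x2], [w, w], tdoa)
enhanced = apply_mask([x1, x1], mask)
```

In this sketch the target-consistent bin passes almost unchanged, while the bin whose phase difference disagrees with the assumed TDOA is strongly attenuated. A full system would apply this to every frame of a short-time Fourier transform before resynthesis.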
Keywords :
Gaussian noise; acoustic noise; array signal processing; parameter estimation; speech enhancement; speech recognition; time-frequency analysis; SNR; delay-and-sum beamforming; digit recognition; phase-dependent time-frequency masking; reverberation; signal-to-noise ratio; speech noise; superdirective beamforming; time-delay-of-arrival; time-frequency phase information; Delay; Frequency domain analysis; Independent component analysis; Microphones; Robustness
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
Print_ISBN :
0-7803-7663-3
DOI :
10.1109/ICASSP.2003.1198873