Title :
On the importance of phase in human speech recognition
Author :
Shi, Guangji ; Shanechi, Maryam Modir ; Aarabi, Parham
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Toronto, Ont.
Abstract :
In this paper, we analyze how uncertainty in the phase of speech signals affects the word recognition error rate of human listeners. The motivating goal is to obtain a quantitative measure of the importance of phase in automatic speech recognition by studying the effects of phase uncertainty on human perception. Listening tests were conducted with 18 listeners under different phase uncertainty and signal-to-noise ratio (SNR) conditions. The results indicate that a small amount of phase error or uncertainty does not affect the recognition rate, whereas a large amount of phase uncertainty affects it significantly. The importance of phase also appears to depend on SNR: the effects of phase uncertainty are more pronounced at lower SNRs than at higher ones. For example, at an SNR of -10 dB, random phase at all frequencies results in a word error rate (WER) of 63%, compared to 24% when the phase is unaltered. At 0 dB, random phase results in a 25% WER, compared to 11% for the unaltered phase case. Listening tests were also conducted with phase reconstructed using the least square error estimation approach. The results indicate that the recognition rate for the reconstructed phase case is very close to that of the perfect phase case (a WER difference of 4% on average).
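The abstract does not say how the phase-uncertain stimuli were generated, so the following is only a minimal sketch of one plausible way to set up such a condition, not the authors' procedure. It assumes the phase error is drawn uniformly from [-max_error, max_error] for each frequency bin of a single frame (max_error = pi corresponds to fully random phase) and that noise is scaled to reach a target SNR. The names perturb_phase, mix_at_snr, and max_error are hypothetical, and a plain DFT stands in for whatever frame-based analysis was actually used.

```python
import cmath
import math
import random


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a short sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns complex samples."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def perturb_phase(frame, max_error, rng):
    """Keep every DFT magnitude; add a uniform phase error in [-max_error, max_error].

    Assumption: the error is independent per frequency bin. The DC bin (and the
    Nyquist bin for even lengths) is left untouched so the output stays real.
    """
    n = len(frame)
    spectrum = dft(frame)
    out = list(spectrum)
    for k in range(1, (n - 1) // 2 + 1):
        noisy = spectrum[k] * cmath.exp(1j * rng.uniform(-max_error, max_error))
        out[k] = noisy
        out[n - k] = noisy.conjugate()  # conjugate symmetry gives a real signal
    return [v.real for v in idft(out)]


def mix_at_snr(signal, noise, snr_db):
    """Scale the noise so that signal power / noise power equals snr_db, then add it."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    gain = math.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * v for s, v in zip(signal, noise)]


rng = random.Random(0)
frame = [math.sin(2 * math.pi * 3 * t / 16) for t in range(16)]  # toy 16-sample frame
noise = [rng.gauss(0.0, 1.0) for _ in frame]
random_phase = perturb_phase(frame, math.pi, rng)   # fully random phase
stimulus = mix_at_snr(random_phase, noise, -10.0)   # -10 dB SNR condition
```

In a real experiment the perturbation would be applied frame by frame with overlap-add resynthesis; that step is omitted here to keep the sketch short.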
Keywords :
error analysis; least squares approximations; speech recognition; automatic speech recognition; human listeners; human perception; human speech recognition; least square error estimation approach; phase uncertainty; signal-to-noise ratio; word recognition error rate; Humans; Phase measurement; Signal analysis; Speech analysis; Testing; Uncertainty; Phase analysis; phase effect; phase reconstruction
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TSA.2005.858512