DocumentCode :
1161388
Title :
On the importance of phase in human speech recognition
Author :
Shi, Guangji ; Shanechi, Maryam Modir ; Aarabi, Parham
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Toronto, Ont.
Volume :
14
Issue :
5
fYear :
2006
Firstpage :
1867
Lastpage :
1874
Abstract :
In this paper, we analyze the effects of uncertainty in the phase of speech signals on the word recognition error rate of human listeners. The motivating goal is to obtain a quantitative measure of the importance of phase in automatic speech recognition by studying the effects of phase uncertainty on human perception. Listening tests were conducted with 18 listeners under different phase uncertainty and signal-to-noise ratio (SNR) conditions. The results indicate that a small amount of phase error or uncertainty does not affect the recognition rate, but a large amount of phase uncertainty significantly affects it. The importance of phase also appears to be SNR-dependent: at lower SNRs, the effects of phase uncertainty are more pronounced than at higher SNRs. For example, at an SNR of -10 dB, random phases at all frequencies result in a word error rate (WER) of 63%, compared to 24% when the phase is unaltered. In comparison, at 0 dB, random phase results in a 25% WER, compared to 11% for the unaltered phase case. Listening tests were also conducted for the case of reconstructed phase based on the least square error estimation approach. The results indicate that the recognition rate for the reconstructed phase case is very close to that of the perfect phase case (a WER difference of 4% on average).
Keywords :
error analysis; least squares approximations; speech recognition; automatic speech recognition; human listeners; human perception; human speech recognition; least square error estimation approach; phase uncertainty; signal-to-noise ratio; word recognition error rate; humans; phase measurement; signal analysis; speech analysis; testing; uncertainty; phase analysis; phase effect; phase reconstruction
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TSA.2005.858512
Filename :
1678004