Title :
Speech recognition using voice-characteristic-dependent acoustic models
Author :
Suzuki, H. ; Zen, H. ; Nankaku, Yoshihiko ; Miyajima, C. ; Tokuda, K. ; Kitamura, Takamitsu
Author_Institution :
Dept. of Comput. Sci., Nagoya Inst. of Technol., Japan
Abstract :
This paper proposes a speech recognition technique based on acoustic models that account for variations in voice characteristics. Context-dependent acoustic models, typically triphone HMMs, are widely used in continuous speech recognition systems. This work hypothesizes that the speaker voice characteristics that humans can perceive by listening are also factors in acoustic variation that should be considered when constructing acoustic models, and it applies tree-based clustering to speaker voice characteristics as well, in order to construct voice-characteristic-dependent acoustic models. In speech recognition using triphone models, the neighboring phonetic context is given in advance by linguistic-phonetic knowledge; in contrast, when voice-characteristic-dependent acoustic models are used, the voice characteristics of the input speech are unknown at recognition time. This paper proposes a method for recognizing speech even when the voice characteristics of the input speech are unknown. The results of a gender-dependent speech recognition experiment show that the proposed method achieves higher recognition performance than conventional methods.
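The abstract does not say how the unknown voice characteristics are handled at recognition time. As a purely illustrative sketch, and not the authors' method, one simple baseline is to score the input against each condition-dependent model and keep the condition with the highest likelihood. Everything below is an assumption made for illustration: the names `MODELS`, `class_a`, `class_b`, and `pick_condition` are hypothetical, each model is a single diagonal-covariance Gaussian rather than an HMM, and frames are treated as independent.

```python
import math

def diag_gaussian_loglik(x, mean, var):
    """Log-density of vector x under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def sequence_loglik(frames, model):
    """Sum of per-frame log-likelihoods (frames treated as independent)."""
    return sum(
        diag_gaussian_loglik(f, model["mean"], model["var"]) for f in frames
    )

def pick_condition(frames, models):
    """Return the condition label whose model scores the frames highest."""
    scores = {label: sequence_loglik(frames, m) for label, m in models.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical toy models: one Gaussian per voice-characteristic class.
# A real system would use HMMs trained on labeled speech instead.
MODELS = {
    "class_a": {"mean": [0.0, 0.0], "var": [1.0, 1.0]},
    "class_b": {"mean": [3.0, 3.0], "var": [1.0, 1.0]},
}

# Toy two-dimensional feature frames standing in for acoustic features.
frames = [[2.8, 3.1], [3.2, 2.9], [2.9, 3.3]]
label, scores = pick_condition(frames, MODELS)
print(label)
```

In this toy setting the frames lie near the mean of `class_b`, so that label is selected. Choosing a single best condition is only one option; summing the likelihoods over all conditions is another common alternative, and the record above does not indicate which, if either, the paper uses.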
Keywords :
decision trees; pattern clustering; speech recognition; acoustic variation; gender-dependent speech recognition; recognition performance; speaker voice characteristics; tree-based clustering; voice-characteristic-dependent acoustic models; Character recognition; Computer science; Context modeling; Decoding; Hidden Markov models; Humans; Loudspeakers; Training data
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
Print_ISBN :
0-7803-7663-3
DOI :
10.1109/ICASSP.2003.1198887