DocumentCode :
2702936
Title :
Latent Prosody Model of Continuous Mandarin Speech
Author :
Chen-Yu Chiang ; Xiao-Dong Wang ; Yuan-Fu Liao ; Yih-Ru Wang ; Sin-Horng Chen ; Keikichi Hirose
Author_Institution :
Dept. of Commun. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Volume :
4
fYear :
2007
fDate :
15-20 April 2007
Abstract :
The major difficulty in prosody modeling and automatic tone recognition of continuous Mandarin speech is the complex interaction of tones and prosody/intonation on F0 contours. In this study, we propose a latent prosody model (LPM) that jointly models the effects of tone and prosody state on F0. Its purposes are twofold: (1) automatic prosody state labeling and (2) improved tone recognition accuracy. The basic idea is to introduce latent prosody state variables into an additive statistical model of F0 that already accounts for the affecting factors of tone and speaker. Experiments on the Tree-Bank corpus showed that the LPM not only gave meaningful prosody state labeling results but also improved the average tone recognition rate from the 80.86% of a multi-layer perceptron (MLP) baseline to 82.55%.
Keywords :
multilayer perceptrons; speech processing; speech recognition; Tree-Bank corpus; additive statistic model; automatic prosody state labeling; automatic tone recognition; continuous Mandarin speech; latent prosody model; multi-layer perceptron; Automatic speech recognition; Context modeling; Gaussian distribution; Labeling; Maximum likelihood detection; Multilayer perceptrons; Natural languages; Recurrent neural networks; Speech recognition; Statistics; tone recognition
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
ISSN :
1520-6149
Print_ISBN :
1-4244-0727-3
Type :
conf
DOI :
10.1109/ICASSP.2007.366990
Filename :
4218178