DocumentCode
394348
Title
Triphone model reconstruction for Mandarin pronunciation variations
Author
Fung, Pascale ; Yi, Liu
Author_Institution
Human Language Technol. Center, Univ. of Sci. & Technol., Hong Kong, China
Volume
1
fYear
2003
fDate
6-10 April 2003
Abstract
The high error rate in spontaneous speech recognition is due in part to poor modeling of pronunciations. In this paper, we propose modeling pronunciation variations through triphone model reconstruction. We first generate a partial change phone model (PCPM) to differentiate pronunciation variations. To improve the resolution of triphone models, the PCPM is used as a hidden model and merged into the pre-trained acoustic model through model reconstruction. To avoid model confusion, auxiliary decision trees are established for the triphone PCPM. Acoustic model reconstruction on triphones is equivalent to decision tree merging. The effectiveness of this approach is evaluated on the 1997 Hub4NE Mandarin Broadcast News Corpus (1997 MBN) with different styles of speech. It yields a significant 2.39% absolute reduction in syllable error rate on spontaneous speech.
Keywords
decision trees; error statistics; hidden Markov models; speech processing; speech recognition; 1997 Hub4NE Mandarin Broadcast News Corpus; 1997 MBN; Mandarin pronunciation variations; PCPM; auxiliary decision trees; hidden Markov model; partial change phone model; pre-trained acoustic model; recognition accuracy; spontaneous speech; syllable error rate reduction; triphone model reconstruction; Broadcasting; Computational efficiency; Decision trees; Error analysis; Hidden Markov models; Humans; Merging; Natural languages; Speech analysis; Speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-7663-3
Type
conf
DOI
10.1109/ICASSP.2003.1198892
Filename
1198892