DocumentCode
353510
Title
A new phonetic tied-mixture model for efficient decoding
Author
Lee, Akinobu ; Kawahara, Tatsuya ; Takeda, Kazuya ; Shikano, Kiyohiro
Author_Institution
Kyoto Univ., Japan
Volume
3
fYear
2000
fDate
2000
Firstpage
1269
Abstract
A phonetic tied-mixture (PTM) model for efficient large vocabulary continuous speech recognition is presented. It is synthesized from context-independent phone models with 64 mixture components per state by assigning different mixture weights according to the shared states of triphones. The mixtures are then re-estimated for optimization. The model achieves a word error rate of 7.0% on a 20,000-word newspaper dictation task, which is comparable to the best figure obtained with triphone models of much higher resolution. Compared with conventional PTMs that share Gaussians across all states, the proposed model is easily trained and reliably estimated. Furthermore, the model enables the decoder to perform efficient Gaussian pruning: computing only two out of 64 components is found to cause no loss of accuracy. Several pruning methods are proposed and compared, and the best one reduces the computation to about 20%.
Keywords
Gaussian distribution; decoding; optimisation; speech coding; speech recognition; Gaussian pruning; PTM; context-independent phone models; efficient decoding; large vocabulary continuous speech recognition; mixture weights; optimization; phonetic tied-mixture model; triphones; word error rate; Context modeling; Decoding; Error analysis; Gaussian distribution; Gaussian processes; Hidden Markov models; Speech recognition; Speech synthesis; State estimation; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location
Istanbul
ISSN
1520-6149
Print_ISBN
0-7803-6293-4
Type
conf
DOI
10.1109/ICASSP.2000.861808
Filename
861808