Title :
A smoothing algorithm for the task adaptation Chinese Trigram model
Author :
Minghu, Jiang ; Baozong, Yuan ; Biqin, Lin ; Xiaofang, Tang
Author_Institution :
Inst. of Inf. Sci., Northern Jiaotong Univ., Beijing, China
Abstract :
The paper addresses two problems. A Chinese trigram model with task adaptation ability is set up. A base of zerogram-to-trigram probability statistics is built from the 1994 “People Daily” corpus, drawing on the successful experience of using HMMs in speech recognition and adopting the Baum-Welch algorithm to optimise the weights. Each weight represents the statistical reliability of the corresponding model. A smoothing algorithm is applied to the probability statistics matrix in the parameter space to offset the sparseness of the statistical data. The “People Daily” corpus statistics are taken as the preliminary results. When the application domain changes, the recognition accuracy obtained with the preliminary statistics declines, so “PC World” is adopted as the corpus of the new domain: successive training is carried out, followed by a second smoothing of the preliminary statistics, and the resulting statistics are taken as the final results. A task-adapted trigram model is thus obtained. The experimental results show that this language model reduces the workload of successive training and effectively reduces the perplexity of language models in the new task domain, giving it a higher language adaptation ability there.
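The weighted combination of zerogram-to-trigram statistics described in the abstract can be illustrated with a minimal sketch. It assumes linear interpolation of the four models, with the weights re-estimated by expectation-maximisation on held-out text (the usual special case of Baum-Welch for this setting). The abstract does not give the authors' exact formulation, so the function names, the toy data, and the choice of held-out text below are all hypothetical.

```python
from collections import Counter

def train_counts(tokens):
    """Collect unigram, bigram, and trigram counts from a token list."""
    uni = Counter(tokens)
    bi = Counter(zip(tokens, tokens[1:]))
    tri = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return uni, bi, tri

def component_probs(w1, w2, w3, counts, vocab_size, total):
    """Return [zerogram, unigram, bigram, trigram] estimates for w3."""
    uni, bi, tri = counts
    p0 = 1.0 / vocab_size                       # zerogram: uniform over the vocabulary
    p1 = uni[w3] / total                        # unigram relative frequency
    # Context counts are approximated by the unigram/bigram counts themselves.
    p2 = bi[(w2, w3)] / uni[w2] if uni[w2] else 0.0
    p3 = tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0
    return [p0, p1, p2, p3]

def em_weights(heldout, counts, vocab_size, total, iters=20):
    """Estimate interpolation weights by EM on held-out tokens (assumed procedure)."""
    lam = [0.25] * 4
    trigrams = list(zip(heldout, heldout[1:], heldout[2:]))
    if not trigrams:
        return lam
    for _ in range(iters):
        expected = [0.0] * 4
        for w1, w2, w3 in trigrams:
            probs = component_probs(w1, w2, w3, counts, vocab_size, total)
            mix = sum(l * p for l, p in zip(lam, probs))
            # E-step: share of this event attributed to each component model
            for k in range(4):
                expected[k] += lam[k] * probs[k] / mix
        # M-step: normalise the expected shares into new weights
        norm = sum(expected)
        lam = [e / norm for e in expected]
    return lam

def interpolated_prob(w1, w2, w3, lam, counts, vocab_size, total):
    """Smoothed P(w3 | w1, w2) as the weighted sum of the four models."""
    probs = component_probs(w1, w2, w3, counts, vocab_size, total)
    return sum(l * p for l, p in zip(lam, probs))

# Toy usage with made-up tokens standing in for a real corpus.
train = list("abcabcabd")
heldout = list("abcabd")
counts = train_counts(train)
lam = em_weights(heldout, counts, vocab_size=5, total=len(train))
p_unseen = interpolated_prob("x", "y", "z", lam, counts, 5, len(train))
```

Because the zerogram term is never zero, the interpolated probability stays positive even for a trigram absent from the training text, which is the sparse-data problem the smoothing is meant to offset. Drawing the held-out text from a new domain would be one possible way to re-tune the weights for that domain, though that is an assumption about, not a description of, the second smoothing step reported here.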
Keywords :
hidden Markov models; optimisation; probability; sparse matrices; speech recognition; statistical analysis; Baum-Welch algorithm; Chinese trigram model; HMM; PC World; People Daily; correlation statistic reliability; language adaptation ability; matrix sparse data; probability statistics information base; smoothing algorithm; speech recognition; successive training; task adaptation; weight optimisation; Adaptation model; Entropy; Hidden Markov models; Information science; Natural languages; Probability; Smoothing methods; Sparse matrices; Speech recognition; Statistics;
Conference_Title :
Signal Processing Proceedings, 1998. ICSP '98. 1998 Fourth International Conference on
Conference_Location :
Beijing
Print_ISBN :
0-7803-4325-5
DOI :
10.1109/ICOSP.1998.770317