Title :
Dynamic language model adaptation using latent topical information and automatic transcripts
Author_Institution :
Graduate Inst. of Comput. Sci. & Inf. Eng., Nat. Taiwan Normal Univ., Taipei, Taiwan
Abstract :
This paper considers dynamic language model adaptation for Mandarin broadcast news recognition. Both contemporary newswire texts and in-domain automatic transcripts are exploited for adaptation. A topical mixture model is presented that dynamically exploits long-span latent topical information. Its underlying characteristics and several model structures are investigated extensively, and its performance is analyzed and verified by comparison with conventional MAP-based adaptation approaches, which are devoted to extracting short-span n-gram information. The fusion of global topical and local contextual information is investigated as well. Speech recognition experiments were conducted on broadcast news collected in Taiwan. Initial results show promising reductions in both perplexity and character error rate.
Keywords :
broadcasting; natural languages; speech recognition; Mandarin broadcast news recognition; Taiwan; character error rate; contemporary newswire text; dynamic language model adaptation; in-domain automatic transcript; long-span latent topical information; short-span n-gram information; Adaptation model; History; Interpolation; Large scale integration; Maximum likelihood estimation; Natural languages; Probability; Radio broadcasting; Speech recognition; TV broadcasting
Conference_Title :
Multimedia and Expo, 2005. ICME 2005. IEEE International Conference on
Print_ISBN :
0-7803-9331-7
DOI :
10.1109/ICME.2005.1521369