Title :
Enhanced MAP adaptation of n-gram language models using indirect correlation of distant words
Author :
Moriya, Takaaki ; Hirose, Keikichi ; Minematsu, Nobuaki ; Jiang, Hui
Author_Institution :
Graduate Sch. of Frontier Sci., Tokyo Univ., Japan
Abstract :
A novel and effective method for adapting n-gram language models to a new domain has been developed. Conventional n-gram correlation represents only direct, surface-level information about adjacent words. We propose a heuristic adaptation method that additionally uses indirect correlation between words that are distant from each other. With the correlation of distant words added, the adapted models capture more information about word co-occurrence in the target domain and achieve a larger perplexity reduction. Furthermore, because the new correlation covers indirect relations that do not appear in surface sentences, the adapted models still work well in domains somewhat different from the target domain. Experiments show that, compared with well-known MAP-based adaptation, the proposed method improves perplexity reduction by approximately 10% in the target domain and also in another domain.
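The abstract gives no formulas, so the following is only a minimal sketch of the two ingredients it names, not the authors' method. It assumes a common count-merging form of MAP adaptation for bigrams, P(w|h) = (tau * P_bg(w|h) + c(h, w)) / (tau + c(h)), where P_bg is a background model and c(.) are counts from the adaptation text. The function names, the prior weight `tau`, and the window size `max_gap` are all hypothetical. How the paper folds distant-word correlation into the estimate is not stated in this record, so the last function only collects counts of non-adjacent word pairs.

```python
from collections import Counter


def bigram_counts(sentences):
    """Count adjacent pairs (h, w) and history occurrences h in tokenized sentences."""
    pair, hist = Counter(), Counter()
    for words in sentences:
        for h, w in zip(words, words[1:]):
            pair[(h, w)] += 1
            hist[h] += 1
    return pair, hist


def map_adapted_prob(w, h, bg_prob, pair, hist, tau=1.0):
    """Assumed count-merging MAP estimate:
    (tau * P_bg(w|h) + c(h, w)) / (tau + c(h)).
    bg_prob is a callable bg_prob(w, h) returning the background probability."""
    return (tau * bg_prob(w, h) + pair[(h, w)]) / (tau + hist[h])


def distant_pair_counts(sentences, max_gap=3):
    """Count ordered pairs of words that are 2..max_gap positions apart.
    Illustrative only: adjacent pairs are excluded because the bigram
    counts already cover them."""
    counts = Counter()
    for words in sentences:
        for i, u in enumerate(words):
            for j in range(i + 2, min(i + max_gap + 1, len(words))):
                counts[(u, words[j])] += 1
    return counts
```

With a uniform background model over a four-word vocabulary and `tau=2.0`, two adaptation sentences that both contain "the cat" pull P(cat | the) well above the background value of 0.25, while `distant_pair_counts` records pairs such as ("the", "sat") that no bigram count sees.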
Keywords :
correlation methods; linguistics; maximum likelihood estimation; natural languages; speech recognition; statistical analysis; MAP adaptation; adjacent words; distant words; estimation parameters; indirect correlation; n-gram language models; perplexity reduction; statistical modeling; Adaptation model; Electronic mail; Information science; Multimedia communication; Multimedia systems; Natural languages; Parameter estimation; Probability; Speech recognition; Vocabulary
Conference_Title :
Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
Print_ISBN :
0-7803-7343-X
DOI :
10.1109/ASRU.2001.1034668