DocumentCode :
3585068
Title :
Combining local and broad topic context to improve term detection
Author :
Wintrode, Jonathan ; Khudanpur, Sanjeev
Author_Institution :
Center for Language & Speech Process., Johns Hopkins Univ., Baltimore, MD, USA
fYear :
2014
Firstpage :
442
Lastpage :
447
Abstract :
We aim to improve term detection performance by augmenting traditional N-gram language models with multiple levels of topic context. We demonstrate that incorporating complementary aspects of topicality leads to significant improvements in term detection accuracy. We represent broad topic context through document-specific latent topics inferred via a Bayesian topic model. We capture local topic context with a cache-based adaptive language model. Measured on four languages from the IARPA Babel program, interpolating unigrams from the broad topic context improves term detection performance by up to 1% absolute via lattice re-scoring. Re-decoding with the same document-specific model improves accuracy by up to 2.1%. Adding local context via cached N-grams improves performance by up to 1.6%. A combined approach, re-decoding with latent topic information and then re-scoring with the local cached N-grams, gives an overall improvement of up to 2.4%. For all languages, combining broad and local topic information outperforms any individual method.
Keywords :
document handling; inference mechanisms; information retrieval; interpolation; speech recognition; Bayesian topic model; IARPA Babel program; N-gram language models; broad topic context; cache-based adaptive language model; document-specific latent topics; lattice rescoring; local cached N-grams; local topic context; speech recognition; speech retrieval; term detection performance improvement; unigram interpolation; Adaptation models; Computational modeling; Context; Context modeling; Hidden Markov models; Lattices; Mathematical model; Speech recognition; language models; speech retrieval; spoken term detection; topic models;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Spoken Language Technology Workshop (SLT), 2014 IEEE
Type :
conf
DOI :
10.1109/SLT.2014.7078615
Filename :
7078615