DocumentCode :
3527725
Title :
Latent topic modelling of word co-occurrence information for spoken document retrieval
Author :
Chen, Berlin
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Normal Univ., Taipei
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
3961
Lastpage :
3964
Abstract :
In this paper, we present a word topic model (WTM) approach, discovering the co-occurrence relationship between words as well as the long-span latent topic information, for spoken document retrieval (SDR). A given document as a whole is modeled as a composite WTM model for generating an observed query. The underlying characteristics and different kinds of model structures are extensively investigated, while the performance of WTM is thoroughly analyzed and verified by comparison with a few existing retrieval models on the TDT-2 SDR task. We also attempt to incorporate part-of-speech (POS) weighting into the representations of the query observations and the WTM models for obtaining better retrieval performance.
Keywords :
probability; query processing; speech recognition; latent topic modelling; part-of-speech; probabilistic latent semantic analysis; spoken document retrieval; word co-occurrence information; word topic model approach; Computer science; Frequency; Hidden Markov models; Indexing; Information retrieval; Natural languages; Performance analysis; Predictive models; Robustness; Speech processing; language model; word topic model
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960495
Filename :
4960495