DocumentCode :
2700702
Title :
Divergence-Based Similarity Measure for Spoken Document Retrieval
Author :
Peng Liu ; Frank K. Soong ; Jian-Lai Zhou
Author_Institution :
Microsoft Research Asia, Beijing, China
Volume :
4
fYear :
2007
fDate :
15-20 April 2007
Abstract :
We propose a novel divergence-based similarity measure for spoken document retrieval (SDR). We first derive a dynamic programming algorithm that measures the Kullback-Leibler divergence between two HMMs. The measure is then generalized to a graph matching algorithm that is efficient for SDR applications. The proposed approach compares the underlying acoustic models of keywords and a target database to alleviate the impact of vocabulary and language model mismatch, e.g., across different domains. Experimental results on the Wall Street Journal (WSJ) database show that the proposed approach achieves performance comparable to that of the word-posterior-based approach and outperforms it when the language model is mismatched. The approach is promising for building open-vocabulary, domain-independent SDR applications.
Keywords :
document handling; graph theory; hidden Markov models; information retrieval; mathematical programming; matrix algebra; speech processing; HMM; Kullback-Leibler divergence; divergence-based similarity measure; graph matching algorithm; programming algorithm; spoken document retrieval; word posterior based approach; Acoustic applications; Acoustic measurements; Asia; Costs; Databases; Dynamic programming; Natural languages; Speech; Vocabulary
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
ISSN :
1520-6149
Print_ISBN :
1-4244-0727-3
Type :
conf
DOI :
10.1109/ICASSP.2007.367170
Filename :
4218044