Title :
A new syllable-lattice based approach for Mandarin spoken document retrieval
Author :
Zhang, Lei ; Gao, Yunxia ; Xiang, Xuezhi ; Lu, Dong
Author_Institution :
Coll. of Inf. & Commun. Eng., Harbin Eng. Univ., Harbin, China
Abstract :
In our Mandarin spoken document retrieval system, we consider the effects of both the retrieval source and the retrieval model. For the retrieval source, a syllable lattice is adopted, which mitigates the effect of speech recognition errors on document retrieval. For the retrieval model, a document length prior is combined with the Jelinek-Mercer smoothing technique, which is widely applied in text document retrieval models. As far as we know, this is the first time a syllable lattice has been combined with a retrieval model based on the document length prior for spoken document retrieval. Experimental results show that the lattice-based method outperforms the 1-best method. Furthermore, in the retrieval model with document length priors, the lattice-based approach achieves the best performance, an improvement of about 30%.
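The retrieval model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard query-likelihood formulation, a length prior proportional to document length, and that each document is represented by expected syllable counts (fractional values such as might be derived from lattice posteriors). The function names, the syllable strings, and the counts are all hypothetical.

```python
import math


def score_document(query, doc_counts, coll_counts, coll_len,
                   lam=0.5, use_length_prior=True):
    """Log query likelihood with Jelinek-Mercer smoothing.

    doc_counts holds (possibly fractional) expected syllable counts for one
    document; coll_counts holds the same totals over the whole collection.
    """
    doc_len = sum(doc_counts.values())
    if doc_len <= 0 or coll_len <= 0:
        return float("-inf")
    # Assumed prior: P(d) proportional to document length.
    score = math.log(doc_len / coll_len) if use_length_prior else 0.0
    for syl in query:
        p_doc = doc_counts.get(syl, 0.0) / doc_len
        p_coll = coll_counts.get(syl, 0.0) / coll_len
        # Jelinek-Mercer: linear interpolation of document and collection models.
        p = (1.0 - lam) * p_doc + lam * p_coll
        if p <= 0.0:
            return float("-inf")  # syllable unseen anywhere in the collection
        score += math.log(p)
    return score


def rank(query, docs, lam=0.5, use_length_prior=True):
    """Return (doc_id, score) pairs sorted from best to worst."""
    coll_counts = {}
    for counts in docs.values():
        for syl, c in counts.items():
            coll_counts[syl] = coll_counts.get(syl, 0.0) + c
    coll_len = sum(coll_counts.values())
    scored = [
        (doc_id, score_document(query, counts, coll_counts, coll_len,
                                lam, use_length_prior))
        for doc_id, counts in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical expected syllable counts for two spoken documents.
docs = {
    "news_a": {"bei": 1.6, "jing": 1.4, "tian": 1.0},
    "news_b": {"shang": 1.0, "hai": 1.0},
}
ranked = rank(["bei", "jing"], docs)
print(ranked[0][0])
```

With 1-best transcripts the counts would be integers taken from a single recognition hypothesis; using fractional expected counts is one way a lattice can retain syllables that the top hypothesis misrecognized.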
Keywords :
information retrieval; speech recognition; Jelinek-Mercer smoothing technique; Mandarin spoken document retrieval; document length prior; retrieval source; speech recognition error; syllable-lattice based approach; text document retrieval model; Broadcasting; Decoding; Educational institutions; Hidden Markov models; Information retrieval; Lattices; Natural languages; Search engines; Smoothing methods; Speech recognition; spoken document retrieval; syllable-lattice; the document length priors;
Conference_Title :
Wireless Communications & Signal Processing, 2009. WCSP 2009. International Conference on
Conference_Location :
Nanjing
Print_ISBN :
978-1-4244-4856-2
Electronic_ISBN :
978-1-4244-5668-0
DOI :
10.1109/WCSP.2009.5371545