Title :
Chinese spoken document retrieval based on syllable neighbor posterior probability matrix
Author :
Zheng, Tieran ; Han, Jiqing
Author_Institution :
Coll. of Comput. Sci., Harbin Inst. of Technol., Harbin
Abstract :
Syllable-lattice-based Chinese spoken document retrieval methods can avoid the out-of-vocabulary (OOV) word problem and compensate for the retrieval performance loss caused by recognition errors. Because lattice-based retrieval approaches lack an effective retrieval model, this paper proposes a retrieval model based on a syllable neighbor posterior probability matrix. The model takes the special structure of the lattice into account and uses posterior probability to measure the similarity between a document and a query. A retrieval method based on the model is given. A series of experiments shows that the method is better suited to retrieval tasks on large-scale corpora.
Keywords :
content-based retrieval; natural language processing; probability; speech recognition; Chinese spoken document retrieval; lattice based retrieval; lattice structure; out of vocabulary word problem; syllable lattice; syllable neighbor posterior probability matrix; Automatic speech recognition; Computer science; Educational institutions; Large-scale systems; Lattices; Size control; Space technology; Text recognition; Vocabulary
Conference_Title :
Audio, Language and Image Processing, 2008. ICALIP 2008. International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-1723-0
Electronic_ISBN :
978-1-4244-1724-7
DOI :
10.1109/ICALIP.2008.4590205