Title :
Spoken Document Retrieval Using Multilevel Knowledge and Semantic Verification
Author :
Huang, Chien-Lin ; Wu, Chung-Hsien
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan
Abstract :
This study presents a novel approach to spoken document retrieval based on multilevel knowledge indexing and semantic verification. Multilevel knowledge indexing considers three information sources, namely transcription data, keywords extracted from spoken documents, and hypernyms of the extracted keywords. A semantic network with forward-backward propagation is presented for semantic verification of the retrieved documents. In the forward step for semantic verification, a bag of keywords is chosen based on word significance measures. Semantic relations are estimated and adopted for verification in the backward procedure. The verification score is then utilized to weight and rerank the retrieved documents to obtain the final results. Experiments are performed on 40 h of anchor speech extracted from 198 h of collected broadcast news. Experimental results indicate that multilevel knowledge indexing and semantic verification achieve better retrieval results than other indexing schemes.
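The indexing and reranking flow described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration, not the authors' method: the level weights, the term-overlap scoring, the multiplicative combination of retrieval and verification scores, and the names level_score, retrieval_score, keyword_overlap and rerank are all hypothetical. In particular, keyword_overlap is a simple stand-in for the semantic network with forward-backward propagation, which the abstract does not specify in enough detail to reproduce.

```python
# Hypothetical level weights; the abstract does not say how the three
# information sources are combined.
LEVEL_WEIGHTS = {"transcription": 0.5, "keywords": 0.3, "hypernyms": 0.2}


def level_score(query_terms, doc_terms):
    """Fraction of query terms found in one index level of a document."""
    if not query_terms:
        return 0.0
    doc_set = set(doc_terms)
    return sum(1 for term in query_terms if term in doc_set) / len(query_terms)


def retrieval_score(query, doc):
    """Weighted sum of per-level overlap scores (an assumed combination rule)."""
    return sum(
        weight * level_score(query[level], doc[level])
        for level, weight in LEVEL_WEIGHTS.items()
    )


def keyword_overlap(query, doc):
    """Stand-in verification score in [0, 1]: Jaccard overlap of keyword sets.

    This is NOT the paper's semantic network; it only fills the same slot.
    """
    q, d = set(query["keywords"]), set(doc["keywords"])
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def rerank(query, docs, verify=keyword_overlap):
    """Weight each retrieval score by the verification score, then sort."""
    scored = [
        (name, retrieval_score(query, doc) * verify(query, doc))
        for name, doc in docs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Here a query and each document are assumed to be dictionaries with the keys "transcription", "keywords" and "hypernyms", each holding a list of terms. Calling rerank(query, docs) returns (document name, score) pairs in descending order of the verification-weighted score.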
Keywords :
information retrieval; speech recognition; forward-backward propagation; multilevel knowledge indexing; semantic verification; spoken document retrieval (SDR); Broadcasting; Content based retrieval; Data mining; Frequency; History; Indexing; Natural languages; Text recognition; Multilevel knowledge; spoken keyword extraction
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2007.907429