Title :
Latent Semantic Rational Kernels for Topic Spotting on Conversational Speech
Author :
Weng, Chao ; Thomson, David L. ; Haffner, Patrick ; Juang, Biing-Hwang Fred
Author_Institution :
Dept. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
Abstract :
In this work, we propose latent semantic rational kernels (LSRK) for topic spotting on conversational speech. Rather than mapping the input weighted finite-state transducers (WFSTs) onto a high-dimensional n-gram feature space, as n-gram rational kernels do, the proposed LSRK maps the WFSTs onto a latent semantic space. With the proposed LSRK, available external knowledge and techniques can be flexibly integrated into a unified WFST-based framework to boost topic spotting performance. We show how to generalize the LSRK using tf-idf weighting, latent semantic analysis, WordNet, and probabilistic topic models. To validate the proposed LSRK framework, we conduct topic spotting experiments on two datasets: Switchboard and the AT&T HMIHY0300 initial collection. The experimental results show that the proposed LSRK achieves significant and consistent topic spotting performance gains over n-gram rational kernels.
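The idea of comparing inputs in a latent semantic space rather than in a raw n-gram space can be illustrated with a minimal sketch. It is not the authors' implementation: it works on plain unigram count vectors instead of WFSTs, uses a smoothed tf-idf weighting and a truncated SVD as stand-ins for the weighting and latent semantic analysis steps named in the abstract, and all function names and the toy data are hypothetical.

```python
import numpy as np

def tfidf(counts):
    """Weight a (n_terms, n_docs) count matrix; return the weighted matrix and idf vector."""
    n_docs = counts.shape[1]
    df = np.count_nonzero(counts, axis=1)          # document frequency per term
    idf = np.log((1 + n_docs) / (1 + df)) + 1.0    # smoothed idf (an assumed variant)
    return counts * idf[:, None], idf

def lsa_basis(weighted, k):
    """Return the top-k left singular vectors, used here as the latent semantic basis."""
    u, _, _ = np.linalg.svd(weighted, full_matrices=False)
    return u[:, :k]

def latent_kernel(x, y, idf, basis):
    """Inner product of two count vectors after tf-idf weighting and latent projection."""
    px = basis.T @ (idf * x)
    py = basis.T @ (idf * y)
    return float(px @ py)

# Toy data (hypothetical): 5 terms x 4 documents, two topics plus one shared term.
counts = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
    [1, 1, 1, 1],
], dtype=float)

weighted, idf = tfidf(counts)
basis = lsa_basis(weighted, k=2)

# Gram matrix of the toy documents under the latent-space kernel.
gram = np.array([
    [latent_kernel(counts[:, i], counts[:, j], idf, basis) for j in range(4)]
    for i in range(4)
])
print(np.round(gram, 2))
```

Because the kernel is an inner product of linearly projected vectors, the resulting Gram matrix is symmetric and positive semidefinite, so it could in principle be passed to a kernel classifier such as an SVM with a precomputed kernel.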
Keywords :
information analysis; probability; speech recognition; AT&T HMIHY0300 initial collection; LSRK; WFST; WordNet; conversational speech; dimensional n-gram feature space; input weighted finite-state transducers; latent semantic analysis; latent semantic rational kernels; latent semantic space; n-gram rational kernels; probabilistic topic models; Switchboard; topic spotting; Kernel; Probabilistic logic; Semantics; Speech; Speech processing; Transducers; Vectors; LDA; LSA; PLSA; WFSTs; rational kernels; tf-idf
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2014.2347133