DocumentCode
290051
Title
Approaches to topic identification on the Switchboard corpus
Author
McDonough, J. ; Ng, K. ; Jeanrenaud, P. ; Gish, H. ; Rohlicek, J.R.
Author_Institution
BBN Syst. & Technol., Cambridge, MA, USA
Volume
i
fYear
1994
fDate
19-22 Apr 1994
Abstract
Topic identification (TID) is the automatic classification of speech messages into one of a known set of possible topics. The TID task can be viewed as having three principal components: 1) event generation, 2) keyword event selection, and 3) topic modeling. Using data from the Switchboard corpus, the authors present experimental results for various approaches to the TID problem and compare the relative effectiveness of each. In addition, they examine the effect of keyword set size on identification accuracy and gauge the loss in performance when mismatched topic modeling and keyword selection schemes are used.
Keywords
identification; speech processing; speech recognition; TID problem; automatic classification; event generation; identification accuracy; keyword event selection; keyword selection schemes; keyword set size; performance; speech messages; switchboard corpus; topic identification; topic modeling; Air traffic control; Data mining; Event detection; Feature extraction; Hidden Markov models; Natural languages; Performance loss; Speech recognition; Vocabulary;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location
Adelaide, SA
ISSN
1520-6149
Print_ISBN
0-7803-1775-0
Type
conf
DOI
10.1109/ICASSP.1994.389275
Filename
389275