DocumentCode :
2770446
Title :
Topic identification from audio recordings using word and phone recognition lattices
Author :
Hazen, Timothy J. ; Richardson, Fred ; Margolis, Anna
Author_Institution :
MIT Lincoln Lab., Lexington
fYear :
2007
fDate :
9-13 Dec. 2007
Firstpage :
659
Lastpage :
664
Abstract :
In this paper, we investigate the problem of topic identification from audio documents using features extracted from speech recognition lattices. We are particularly interested in the difficult case where the training material is minimally annotated with only topic labels. Under this scenario, the lexical knowledge that is useful for topic identification may not be available, and automatic methods for extracting linguistic knowledge useful for distinguishing between topics must be relied upon. Towards this goal, we investigate the problem of topic identification on conversational telephone speech from the Fisher corpus under a variety of increasingly difficult constraints. We contrast the performance of systems that have knowledge of the lexical units present in the audio data against systems that rely entirely on phonetic processing.
Keywords :
audio recording; audio signal processing; computational linguistics; feature extraction; speech recognition; word processing; Fisher corpus; audio document; linguistic knowledge; phone recognition lattice; topic identification; word recognition; Automatic speech recognition; Degradation; Laboratories; Lattices; Support vector machines; Telephony; Vocabulary; Audio document processing; topic spotting
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-1746-9
Electronic_ISBN :
978-1-4244-1746-9
Type :
conf
DOI :
10.1109/ASRU.2007.4430190
Filename :
4430190