Title :
Improving ASR by integrating lecture audio and slides
Author :
Miranda, Joao ; Neto, Joao Paulo ; Black, Alan W.
Author_Institution :
INESC-ID / Inst. Super. Tecnico, Lisbon, Portugal
Abstract :
We propose a method to combine the audio of a lecture with its supporting slides in order to improve automatic speech recognition performance. We view the lecture speech and the slides as parallel streams which contain redundant information. We integrate the two streams by first aligning the speech with the slide words and then biasing the recognizer's language model towards the words in the slides, thus correcting errors in the ASR transcripts. We obtain a 5.9% relative WER improvement on a lecture test set, when compared to a speech-recognition-only system.
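The abstract reports its result as a relative WER improvement. As a reading aid, the following is a minimal Python sketch of how word error rate and a relative improvement are commonly computed. It is not the authors' scoring code; the function names `word_error_rate` and `relative_improvement` are hypothetical, and the numbers in the usage lines are illustrative only, since the abstract does not state the baseline WER.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only (not taken from the paper).
wer = word_error_rate("a b c d", "a x c")
gain = relative_improvement(0.20, 0.15)
```

A relative figure such as 5.9% is a fraction of the baseline error rate, not a difference in absolute WER points.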
Keywords :
speech recognition; ASR transcript; automatic speech recognition; integrating lecture audio; language model; lecture speech; lecture test set; parallel stream slide; relative WER improvement; Computational modeling; Computer science; Data models; Lattices; Multimedia communication; Speech; Speech recognition; Lecture; Slides; Speech Recognition; System Combination;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639249