DocumentCode :
341679
Title :
Improving acoustic models with captioned multimedia speech
Author :
Jang, Photina Jaeyun ; Hauptmann, Alexander G.
Author_Institution :
Dept. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
Volume :
2
fYear :
1999
fDate :
Jul 1999
Firstpage :
767
Abstract :
Speech recognition can be used to create searchable transcripts for audio indexing in digital video libraries. With current technologies, large amounts of hand-transcribed speech training data are required to build or improve the acoustic models of highly accurate speech recognition systems. We present a technique that uses closed-captioned television broadcasts as a source of large amounts of automatically extracted, accurately transcribed speech for improving acoustic models. The errorful closed-caption text is aligned with the likewise errorful speech recognition output, and each matching segment is used, together with its corresponding audio segment, as acoustic training data to improve the speech recognition system. Our technique automatically extracted 131.4 hours of transcribed speech and improved the word error rate of our best current speech recognition system (Sphinx-III) from 32.82% to 31.19%. A speech recognizer trained exclusively on 70.7 hours of this automatically transcribed speech produced a word error rate of 32.7%.
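The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `matching_segments`, the `(word, start_s, end_s)` hypothesis format, the `min_words` threshold, and the use of Python's `difflib` as the aligner are all assumptions made for illustration. The idea it shows is only that word runs on which the caption text and the recognizer output agree can be kept, together with the recognizer's time stamps, as candidate training segments.

```python
from difflib import SequenceMatcher


def matching_segments(caption_words, asr_hyp, min_words=3):
    """Return (start_s, end_s, words) for runs where captions and ASR agree.

    caption_words: list of caption tokens (hypothetical input format).
    asr_hyp: list of (word, start_s, end_s) tuples from a recognizer
             (hypothetical format; real decoder output may differ).
    min_words: assumed minimum run length for a segment to be kept.
    """
    asr_words = [word for word, _, _ in asr_hyp]
    matcher = SequenceMatcher(a=caption_words, b=asr_words, autojunk=False)

    segments = []
    for a_start, b_start, size in matcher.get_matching_blocks():
        if size < min_words:
            continue  # skip short agreements and the zero-length sentinel
        start_s = asr_hyp[b_start][1]           # start time of first matched word
        end_s = asr_hyp[b_start + size - 1][2]  # end time of last matched word
        segments.append((start_s, end_s, caption_words[a_start:a_start + size]))
    return segments
```

Each returned tuple gives the time span that would be cut from the audio and the agreed-upon word sequence that would serve as its transcript. A longer `min_words` trades the amount of extracted speech for a lower chance of accidental agreement.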
Keywords :
indexing; multimedia systems; speech recognition; video databases; Sphinx-III; acoustic models; audio indexing; captioned multimedia speech; digital video libraries; hand-transcribed speech training data; matching segments; searchable transcripts; speech recognition; television broadcasts; Data mining; Digital audio broadcasting; Digital video broadcasting; Error analysis; Indexing; Multimedia communication; Software libraries; Speech recognition; TV; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multimedia Computing and Systems, 1999. IEEE International Conference on
Conference_Location :
Florence
Print_ISBN :
0-7695-0253-9
Type :
conf
DOI :
10.1109/MMCS.1999.778582
Filename :
778582