DocumentCode :
542291
Title :
Towards automatic corpus preparation for a German broadcast news transcription system
Author :
Macherey, Wolfgang ; Ney, Hermann
Author_Institution :
Lehrstuhl für Informatik VI, Computer Science Department, RWTH Aachen - University of Technology, 52056 Aachen, Germany
Volume :
1
fYear :
2002
fDate :
13-17 May 2002
Abstract :
When setting up a speech recognition system for a new domain, considerable manual effort is spent on corpus preparation, i.e., data acquisition, cutting and segmentation of the audio material, generation of pronunciation lexica, and the definition of suitable training and test sets. In this paper we describe several methods that help to automate and thus speed up this procedure. For this purpose, we assume that only a preliminary, partially incorrect textual transcription is available. The effectiveness of the proposed methods is demonstrated with the development of a transcription system for the recognition of German broadcast news.
Keywords :
Adaptation model; Biomedical monitoring; Irrigation; Markov processes; Optimization; Speech; Temperature sensors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5743822
Filename :
5743822