Title :
Progress in the CU-HTK broadcast news transcription system
Author :
Gales, Mark J F ; Kim, Do Yeong ; Woodland, Philip C. ; Chan, Ho Yin ; Mrva, David ; Sinha, Rohit ; Tranter, Sue E.
Author_Institution :
Eng. Dept., Cambridge Univ.
Abstract :
Broadcast news (BN) transcription has been a challenging research area for many years. In the last couple of years, the availability of large amounts of roughly transcribed acoustic training data and advanced model training techniques has offered the opportunity to greatly reduce the error rate on this task. This paper describes the design and performance of BN transcription systems which make use of these developments. First, the effects of using lightly supervised training data and advanced acoustic modeling techniques are discussed. The design of a real-time broadcast news recognition system using these new models is then detailed. As system combination has been found to yield large gains in performance, a range of frameworks that allow multiple recognition outputs to be combined are next described. These include the use of multiple types of acoustic models and multiple segmentations. As a contrast, a system developed by multiple sites allowing cross-site combination, the "SuperEARS" system, is also described. The various models and recognition configurations are evaluated using several recent BN development and evaluation test sets. These new BN transcription systems can give gains of over 25% relative to the CU-HTK 2003 BN system.
Keywords :
broadcasting; speech recognition; CU-HTK broadcast news transcription system; SuperEARS system; advanced acoustic model training techniques; cross-site combination; error rate reduction; lightly supervised training data; multiple recognition outputs; multiple segmentations; real-time broadcast news recognition system; recognition configurations; roughly transcribed acoustic training data; Acoustic testing; Availability; Broadcasting; Ear; Error analysis; Loudspeakers; Performance gain; Real time systems; Speech recognition; Training data; Automatic speech recognition; broadcast news (BN) transcription; diarization
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2006.878264