DocumentCode
2971950
Title
The ESAT 2008 system for N-Best Dutch speech recognition benchmark
Author
Demuynck, Kris ; Puurula, Antti ; Van Compernolle, Dirk ; Wambacq, Patrick
Author_Institution
Dept. of Electr. Eng., Katholieke Univ. Leuven, Leuven, Belgium
fYear
2009
fDate
Nov. 13 2009-Dec. 17 2009
Firstpage
339
Lastpage
344
Abstract
This paper describes the ESAT 2008 Broadcast News transcription system for the N-Best 2008 benchmark, developed in part to test the recent SPRAAK Speech Recognition Toolkit. The ESAT system was developed for the Southern Dutch Broadcast News subtask of N-Best using standard methods of modern speech recognition. A combination of improvements was made in commonly overlooked areas such as text normalization, pronunciation modeling, lexicon selection and morphological modeling, virtually solving the out-of-vocabulary (OOV) problem for Dutch by reducing the OOV rate to 0.06% on the N-Best development data and 0.23% on the evaluation data. Recognition experiments were run with several configurations, comparing one-pass vs. two-pass decoding, high-order vs. low-order n-gram models, different lexicon sizes and different types of morphological modeling. The system achieved a word error rate (WER) of 7.23% on the broadcast news development data and 20.3% on the much more difficult evaluation data of N-Best.
Keywords
broadcasting; natural language processing; speech recognition; text analysis; ESAT 2008 broadcast news transcription system; ESAT 2008 system; N-Best Dutch speech recognition benchmark; SPRAAK speech recognition toolkit; Southern Dutch Broadcast News; broadcast news development data; lexicon selection; morphological modeling; out-of-vocabulary problem; pronunciation modeling; text normalization; word error rate; Benchmark testing; Broadcasting; Decoding; Error analysis; Filtering; Loudspeakers; Speech analysis; Speech recognition; Telephony; Vocabulary;
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location
Merano
Print_ISBN
978-1-4244-5478-5
Electronic_ISBN
978-1-4244-5479-2
Type
conf
DOI
10.1109/ASRU.2009.5373311
Filename
5373311