Title :
Data collection and evaluation of AURORA-2 Japanese corpus [speech recognition applications]
Author :
Nakamura, Satoshi ; Yamamoto, Kazumasa ; Takeda, Kazuya ; Kuroiwa, Shingo ; Kitaoka, Norihide ; Yamada, Takeshi ; Mizumachi, Mitsunori ; Nishiura, Takanobu ; Fujimoto, Masakiyo ; Saso, A. ; Endo, Toshiki
Date :
30 Nov.-3 Dec. 2003
Abstract :
Speech recognition systems still need improvement when they are exposed to noisy environments. For this improvement, the development of standard evaluation corpora and assessment technologies is essential. Recently, the AURORA-2 and AURORA-3 corpora and their evaluation scenarios have had a significant impact on noisy speech recognition research. This paper introduces a Japanese noisy speech corpus and its evaluation scripts, called AURORA-2J. AURORA-2J is a Japanese connected digits corpus. The data collection and evaluation scenarios are designed in the same way as AURORA-2 with the help of the ETSI AURORA group. Furthermore, we have collected an in-car speech corpus similar to AURORA-3. The in-car speech corpus includes Japanese connected digits and command words collected in a moving car. This paper describes the data collection, the baseline scripts, and the baseline performance.
Keywords :
natural languages; speech recognition; AURORA-2J Japanese connected digits corpus; AURORA-3; command words; data collection; evaluation corpus; evaluation scripts; in-car speech corpus; moving car collected speech; noisy speech corpus; noisy speech recognition; Acoustic noise; Additive noise; Large-scale systems; Natural languages; Noise robustness; Speech analysis; Speech recognition; Standards development; Telecommunication standards; Working environment noise;
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318511