DocumentCode :
3328919
Title :
The 2003 ISL rich transcription system for conversational telephony speech
Author :
Soltau, Hagen ; Yu, Hua ; Metze, Florian ; Fügen, Christian ; Jin, Qin ; Jou, Szu-Chen
Author_Institution :
Interactive Syst. Labs., Karlsruhe, Germany
Volume :
1
fYear :
2004
fDate :
17-21 May 2004
Abstract :
This paper describes the ISL large vocabulary conversational telephony speech recognition system, which was tested in NIST's RT-03S ("Switchboard") evaluation. We present our experiments on improving preprocessing, acoustic modelling, and language modelling. The system features phone-dependent semi-tied full covariances, semi-tied clustering of septa-phones, clustering across phones, feature adaptive training, robust estimation of VTLN and MLLR, as well as context-dependent interpolation of language models. We present detailed results for each stage of our multi-pass transcription scheme. System development started with a 1997 SWB system, yielding a word error rate of 35.1% on our internal 1h development set. The final system performed at 21.8%, a 38% relative improvement. The error rate on the RT-03 CTS evaluation set is 23.4%.
Keywords :
covariance analysis; error statistics; feature extraction; interpolation; parameter estimation; pattern clustering; speech processing; speech recognition; telephony; vocabulary; 2003 ISL rich transcription system; MLLR; NIST RT-03S; Switchboard evaluation; VTLN; acoustic modelling; across-phone clustering; context-dependent interpolation; conversational telephony speech; feature adaptive training; language modelling; large vocabulary; multi-pass transcription scheme; phone-dependent semi-tied full covariances; preprocessing; robust estimation; semi-tied septa-phone clustering; speech recognition system; word error rate; Acoustic testing; Error analysis; Interpolation; Maximum likelihood linear regression; NIST; Robustness; Speech recognition; System testing; Telephony; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1326100
Filename :
1326100