Title :
Evaluation of automatic transcription systems for the judicial domain
Author :
Lööf, J. ; Falavigna, D. ; Schlüter, R. ; Giuliani, D. ; Gretter, R. ; Ney, H.
Author_Institution :
Comput. Sci. Dept., RWTH Aachen Univ., Aachen, Germany
Abstract :
This paper describes two different automatic transcription systems developed for judicial application domains for the Polish and Italian languages. The judicial domain requires to cope with several factors which are known to be critical for automatic speech recognition, such as: background noise, reverberation, spontaneous and accented speech, overlapped speech, cross channel effects, etc. The two automatic speech recognition (ASR) systems have been developed independently starting from out-of-domain data and, then, they have been adapted using a certain amount of in-domain audio and text data. The ASR performance have been measured on audio data acquired in the courtrooms of Naples and Wroclaw. The resulting word error rates are around 40%, for Italian, and around between 30% and 50% for Polish. This performance, similar to that reported for other comparable ASR tasks (e.g. meeting transcriptions with distant microphone), suggests that possible applications can address tasks such as indexing and/or information retrieval in multimedia documents recorded during judicial debates.
Keywords :
natural language processing; speech recognition; Italian languages; Polish languages; automatic speech recognition system; automatic transcription system; judicial application domains; Automatic transcription; cross-channel effects; domain adaptation; judicial domain;
Conference_Titel :
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location :
Berkeley, CA
Print_ISBN :
978-1-4244-7904-7
Electronic_ISBN :
978-1-4244-7902-3
DOI :
10.1109/SLT.2010.5700852