DocumentCode :
323561
Title :
Multilingual phone recognition of spontaneous telephone speech
Author :
Corredor-Ardoy, C. ; Lamel, L. ; Adda-Decker, M. ; Gauvain, J.-L.
Author_Institution :
Lab. d'Inf. pour la Mécanique et les Sci. de l'Ingénieur, CNRS, Orsay, France
Volume :
1
fYear :
1998
fDate :
12-15 May 1998
Firstpage :
413
Abstract :
In this paper we report on experiments with phone recognition of spontaneous telephone speech. Phone recognizers were trained and assessed on IDEAL, a multilingual corpus containing telephone speech in French, British English, German and Castilian Spanish. We investigated the influence of the training material composition (size and linguistic content) on the recognition performance using context-independent (CI) hidden Markov models (HMMs) and phonotactic bigram models. We found that when testing on spontaneous speech data, using only spontaneous speech training data gave the highest phone accuracies for the four languages, even though this data comprises only 14% of the available training data. The use of context-dependent (CD) HMMs reduced the phone error across the four languages, with the average error reduced to 51.9% from the 57.4% obtained with CI models. We suggest a straightforward way of detecting non-speech phenomena. The basic idea is to remove sequences of consonants between two silence labels from the recognized phone strings prior to scoring. This simple technique reduces the average phone error rate by 5.4% relative. The lowest phone error with CD models and filtering was obtained for Spanish (39.1%), with the four-language average being 49.1%.
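The filtering step described in the abstract can be illustrated with a minimal sketch. It assumes the recognizer output is a list of phone labels, that silence is labelled `sil`, and that the consonant inventory is the small hypothetical set below; none of these names or labels come from the paper, and the authors' actual implementation may differ. The sketch also leaves both surrounding silence labels in place after a run is removed, which is a choice made here, not something the abstract specifies.

```python
from typing import Iterable, List, Set

SILENCE = "sil"  # assumed silence label, not taken from the paper
# Hypothetical consonant inventory, for illustration only.
CONSONANTS: Set[str] = {"p", "t", "k", "b", "d", "g", "f", "s", "z", "m", "n", "l", "r"}


def filter_consonant_runs(
    phones: Iterable[str],
    silence: str = SILENCE,
    consonants: Set[str] = CONSONANTS,
) -> List[str]:
    """Drop consonant-only runs that are enclosed by two silence labels."""
    out: List[str] = []
    pending: List[str] = []   # phones seen since the last silence label
    seen_silence = False      # True once a left-hand silence exists
    for p in phones:
        if p == silence:
            enclosed = seen_silence and bool(pending)
            if enclosed and all(q in consonants for q in pending):
                pass  # consonant-only run between two silences: discard it
            else:
                out.extend(pending)
            out.append(silence)
            pending = []
            seen_silence = True
        else:
            pending.append(p)
    out.extend(pending)  # trailing phones have no closing silence: keep them
    return out


def relative_reduction(before: float, after: float) -> float:
    """Relative error reduction, e.g. from 51.9% to 49.1%."""
    return (before - after) / before
```

With these assumptions, `filter_consonant_runs(["sil", "a", "t", "sil", "s", "t", "sil", "o"])` removes only the `s t` run, and `relative_reduction(51.9, 49.1)` is about 0.054, which is consistent with the 5.4% relative reduction quoted above.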
Keywords :
hidden Markov models; speech recognition; telephony; British English; Castilian Spanish; French; German; HMM; IDEAL multilingual corpus; context-dependent hidden Markov models; context-independent hidden Markov models; filtering; linguistic content; multilingual phone recognition; non-speech phenomena detection; phonotactic bigram models; relative average phone error rate; spontaneous telephone speech; training material composition; Composite materials; Context modeling; Error analysis; Filtering; Hidden Markov models; Natural languages; Speech recognition; Telephony; Testing; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
ISSN :
1520-6149
Print_ISBN :
0-7803-4428-6
Type :
conf
DOI :
10.1109/ICASSP.1998.674455
Filename :
674455