DocumentCode :
284592
Title :
On the use of acoustic-phonetic features in interactive labelling of multi-lingual speech corpora
Author :
Dalsgaard, P. ; Andersen, O. ; Barry, W. ; Jørgensen, R.
Author_Institution :
Speech Technol. Centre, Aalborg Univ., Denmark
Volume :
1
fYear :
1992
fDate :
23-26 Mar 1992
Firstpage :
549
Abstract :
Results are reported from research on the use of continuously valued acoustic-phonetic features in the multi-language label alignment of combined speech corpora from three European languages: Danish, English, and Italian. A self-organizing neural network is used to transform cepstrum coefficients into a set of features, which are subsequently transformed into a set of principal components. These are used to model individual phonemes, which are used in a Viterbi search/level-building process to align an independently given string of phonemes with the corresponding speech signal. The results obtained show an overall accuracy of 55.7% in the positioning of the label boundary transitions in the combined test corpus. Detailed analysis shows that certain sound class boundaries are very accurately positioned. To provide a general solution to the problem of positioning badly positioned boundaries, an interactive component of the alignment system has been developed. First results demonstrate this component to be very valuable in the task of user-assisted boundary positioning.
Keywords :
self-organising feature maps; speech recognition; Danish; English; Italian; acoustic-phonetic features; cepstrum coefficients; interactive component; interactive labelling; label boundary transitions; multilingual speech corpora; phonemes; principal components; research; self-organizing neural network; sound class boundaries; speech recognition; speech signal; user-assisted boundary positioning; Cepstrum; Collaboration; Educational institutions; Labeling; Natural languages; Neural networks; Signal processing; Speech processing; System testing; Viterbi algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1992. ICASSP-92., 1992 IEEE International Conference on
Conference_Location :
San Francisco, CA
ISSN :
1520-6149
Print_ISBN :
0-7803-0532-9
Type :
conf
DOI :
10.1109/ICASSP.1992.225849
Filename :
225849