DocumentCode :
134282
Title :
Unsupervised acoustic model training for the Korean language
Author :
Laurent, A. ; Hartmann, W. ; Lamel, Lori
Author_Institution :
Spoken Language Process. Group, LIMSI, Orsay, France
fYear :
2014
fDate :
12-14 Sept. 2014
Firstpage :
469
Lastpage :
473
Abstract :
This paper investigates unsupervised training strategies for the Korean language in the context of the DGA RAPID Rapmat project. As with previous studies, we begin with only a small amount of manually transcribed data to build preliminary acoustic models. Using the initial models, a larger set of untranscribed audio data is decoded to produce approximate transcripts. We compare both GMM and DNN acoustic models for both the unsupervised transcription and the final recognition system. While the DNN acoustic models produce a lower word error rate on the test set, training on the transcripts from the GMM system provides the best overall performance. We also achieve better performance by expanding the original phone set. Finally, we examine the efficacy of automatically building a test set by comparing system performance both before and after manually correcting the test set.
Keywords :
acoustic signal processing; natural language processing; speech recognition; unsupervised learning; DGA RAPID Rapmat project; DNN acoustic model; GMM acoustic model; Korean language; acoustic models; approximate transcripts; manually transcribed data; phone set; recognition system; system performance analysis; unsupervised acoustic model training; unsupervised training strategies; untranscribed audio data decoding; word error rate; Acoustics; Data models; Hidden Markov models; Speech; Speech recognition; Training; Vocabulary; korean; speech recognition; under-resourced language; unsupervised training;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2014 9th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Conference_Location :
Singapore
Type :
conf
DOI :
10.1109/ISCSLP.2014.6936675
Filename :
6936675