Title :
Pervasive unsupervised adaptation for lecture speech transcription
Author :
Willett, D. ; Niesler, Thomas ; McDermott, Erik ; Minami, Yasuhiro ; Katagiri, Shigeru
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
Abstract :
Unsupervised adaptation has evolved as a popular approach for tuning the acoustic models of speaker-independent speech recognition systems to specific speakers, speaker groups or channel conditions while making use of only untranscribed data. This study focuses on procedures for unsupervised adaptation of other probabilistic models that are involved in state-of-the-art speech recognizers and on the joint adaptation of multiple knowledge sources. In particular, we outline and evaluate approaches for adapting both the language model and the pronunciation model (lexicon) without supervision. Initial experiments on off-line lecture speech transcription achieved small but promising word error rate improvements with each approach applied separately. The experimental results on the joint application of acoustic, language and pronunciation model adaptation indicate that the individually achievable performance improvements are additive.
Keywords :
linguistics; speech recognition; unsupervised learning; acoustic model adaptation; language model; language model adaptation; lecture speech transcription; lexicon; multiple knowledge sources; pervasive unsupervised adaptation; probabilistic models; pronunciation model; pronunciation model adaptation; state-of-the-art speech recognizers; word error rate; Acoustic applications; Adaptation model; Africa; Error analysis; Laboratories; Loudspeakers; Maximum likelihood decoding; Natural languages; Speech recognition; Training data
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
Print_ISBN :
0-7803-7663-3
DOI :
10.1109/ICASSP.2003.1198775