DocumentCode :
591909
Title :
Transcription of multi-genre media archives using out-of-domain data
Author :
Bell, Patrick J. ; Gales, Mark J.F. ; Lanchantin, Pierre ; Liu, Xindong ; Long, Yan ; Renals, Steve ; Swietojanski, Pawel ; Woodland, Philip C.
Author_Institution :
Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK
fYear :
2012
fDate :
2-5 Dec. 2012
Firstpage :
324
Lastpage :
329
Abstract :
We describe our work on developing a speech recognition system for multi-genre media archives. The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem HMMs, we present Multi-level Adaptive Networks (MLAN), a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that it provides a substantial reduction in WER over other systems, with relative WER reductions of 15% over a PLP baseline, 9% over in-domain tandem features and 8% over the best out-of-domain tandem features.
Keywords :
hidden Markov models; information retrieval systems; records management; speech recognition; MLAN; deep neural networks; hidden Markov model; in-domain tandem features; multigenre media archives transcription; multilevel adaptive networks; out-of-domain posterior features; relative WER reductions; speech recognition system; tandem HMM; Acoustics; Adaptation models; Hidden Markov models; Neural networks; Speech; Training; Training data; cross-domain adaptation; media archives; speech recognition; tandem;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Spoken Language Technology Workshop (SLT), 2012 IEEE
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4673-5125-6
Electronic_ISBN :
978-1-4673-5124-9
Type :
conf
DOI :
10.1109/SLT.2012.6424244
Filename :
6424244