DocumentCode
2769531
Title
Discriminative training of multi-state barge-in models
Author
Ljolje, Andrej ; Goffin, Vincent
Author_Institution
AT&T Labs - Research, Florham Park
fYear
2007
fDate
9-13 Dec. 2007
Firstpage
353
Lastpage
358
Abstract
A barge-in system whose design mirrors that of the acoustic model used in commercial applications has been built and evaluated. It uses standard hidden Markov model structures, cepstral features, and multiple hidden Markov models for both the speech and non-speech parts of the model. It is tested on a large number of real-world databases using noisy speech onset positions, which were determined by forced alignment of lexical transcriptions with the recognition model. The maximum likelihood (ML) trained model achieves low false rejection rates at the expense of high false acceptance rates. Discriminative training using a modified algorithm based on the maximum mutual information criterion halves the false acceptance rates while preserving the low false rejection rates. Combining an energy-based voice activity detector with the hidden Markov model-based barge-in models achieves the best performance.
Keywords
database management systems; hidden Markov models; speech recognition; acoustic model; discriminative training; hidden Markov model structure; multistate barge-in model; real-world database; Acoustic applications; Acoustic signal detection; Automatic speech recognition; Delay; Electrical capacitance tomography; Face detection; Speech processing; Speech synthesis; VAD; acoustic modeling; barge-in; dialog systems
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Conference_Location
Kyoto
Print_ISBN
978-1-4244-1746-9
Electronic_ISBN
978-1-4244-1746-9
Type
conf
DOI
10.1109/ASRU.2007.4430137
Filename
4430137