Title :
Bag of n-gram driven decoding for LVCSR system harnessing
Author :
Bougares, Fethi ; Estève, Yannick ; Deléglise, Paul ; Linarès, Georges
Author_Institution :
LIUM, Univ. of Le Mans, Le Mans, France
Abstract :
This paper focuses on combining automatic speech recognition systems using driven decoding paradigms. The driven decoding algorithm (DDA) uses the 1-best hypothesis provided by an auxiliary system as an additional knowledge source in the search algorithm of a primary system. Previous studies showed that DDA outperforms ROVER when the primary system is guided by a more accurate system. In this paper we propose a new method for handling auxiliary transcriptions, which are represented as a bag of n-grams (BONG) without temporal matching. This representation makes it easier to combine several hypotheses given by different auxiliary systems. Using BONG combination with hypotheses provided by two auxiliary systems, each of which has a WER above 23% on the same data, our experiments show that a CMU Sphinx based ASR system can reduce its WER from 19.85% to 18.66%, which is better than the results obtained with DDA or classical ROVER combination.
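The bag-of-n-grams idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_bong`, `longest_match`, `rescore`), the maximum n-gram order of 3, and the additive bonus are all assumptions made for illustration, since the abstract does not give the actual scoring formula. The sketch only shows how auxiliary 1-best hypotheses could be pooled into one bag without timestamps, and how a primary decoder might check whether a candidate word extends an n-gram seen in that bag.

```python
from collections import Counter


def ngrams(words, n):
    """Return every contiguous n-gram (as a tuple) in a list of words."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_bong(hypotheses, max_n=3):
    """Pool the n-grams (orders 1..max_n) of all auxiliary hypotheses.

    Word timings are ignored, so hypotheses from several auxiliary
    systems can be merged without any temporal alignment.
    """
    bag = Counter()
    for hyp in hypotheses:
        words = hyp.split()
        for n in range(1, max_n + 1):
            bag.update(ngrams(words, n))
    return bag


def longest_match(bag, history, word, max_n=3):
    """Order of the longest n-gram ending in `word` that is in the bag.

    `history` is the list of words already decoded; returns 0 when
    not even the unigram is present.
    """
    for n in range(max_n, 0, -1):
        if len(history) < n - 1:
            continue  # not enough context for this order
        context = tuple(history[-(n - 1):]) if n > 1 else ()
        if context + (word,) in bag:
            return n
    return 0


def rescore(lm_logprob, match_order, bonus=0.5):
    """Hypothetical rescoring: add a bonus that grows with match order."""
    return lm_logprob + bonus * match_order


# Two auxiliary 1-best hypotheses merged into a single bag.
bag = build_bong(["the cat sat", "a cat sat down"])
order = longest_match(bag, ["the", "cat"], "sat")
score = rescore(-4.0, order)
```

Because the bag is just a multiset of n-grams, adding a further auxiliary system amounts to passing one more string to `build_bong`, which is the ease of combination the abstract refers to.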
Keywords :
decoding; search problems; speech coding; speech recognition; 1-best hypothesis; BONG combination; CMU Sphinx based ASR system; LVCSR system harnessing; automatic speech recognition system; auxiliary system; auxiliary transcription; bag-of-n-grams; driven decoding algorithm; driven decoding paradigm; knowledge source; search algorithm; Acoustics; Adaptation models; Decoding; Error analysis; Hidden Markov models; Speech; Speech recognition; bag of n-gram driven decoding; speech recognition; system combination;
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
DOI :
10.1109/ASRU.2011.6163944