DocumentCode :
1948085
Title :
On statistical machine translation method for lexicon refinement in speech recognition
Author :
Haihua Xu ; Xiong Xiao ; Eng-Siong Chng ; Haizhou Li
Author_Institution :
Temasek Labs., Nanyang Technol. Univ., Singapore, Singapore
fYear :
2015
fDate :
12-15 July 2015
Firstpage :
25
Lastpage :
29
Abstract :
In low-resource Automatic Speech Recognition (ASR), one usually resorts to the Statistical Machine Translation (SMT) technique to learn transform rules that refine the grapheme lexicon. Doing this poses two challenges. One is to generate grapheme sequences from the training data as the targets, which are paired with the original transcripts to train SMT models; the other is to effectively prune the learned rules from the translation model. In this paper we further this study. First, we propose a simple but effective pruning method; second, to see in which cases we are able to learn better rules, we investigate different setups with various acoustic and language model combinations; finally, to examine whether the rules learned in different setups are complementary, lexicons generated via different rule tables are merged in ASR experiments. We report a WER reduction of up to 6.2% with the proposed technique.
Keywords :
language translation; learning (artificial intelligence); speech recognition; ASR; SMT; automatic speech recognition; grapheme lexicon refinement; pruning method; statistical machine translation method; Accuracy; Acoustics; Data models; Hidden Markov models; Speech; Speech recognition; Training data; automatic speech recognition; grapheme lexicon; lexicon learning; statistical machine translation; system fusion;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on
Conference_Location :
Chengdu
Type :
conf
DOI :
10.1109/ChinaSIP.2015.7230355
Filename :
7230355