Regional Information Center for Science and Technology - On statistical machine translation method for lexicon refinement in speech recognition

DocumentCode :

1948085

Title :

On statistical machine translation method for lexicon refinement in speech recognition

Author :

Haihua Xu ; Xiong Xiao ; Eng-Siong Chng ; Haizhou Li

Author_Institution :

Temasek Labs., Nanyang Technol. Univ., Singapore, Singapore

fYear :

2015

fDate :

12-15 July 2015

Firstpage :

Lastpage :

Abstract :

In low-resource Automatic Speech Recognition (ASR), one usually resorts to Statistical Machine Translation (SMT) techniques to learn transform rules that refine a grapheme lexicon. Doing so poses two challenges. One is to generate grapheme sequences from the training data as targets, which are paired with the original transcripts to train SMT models; the other is to effectively prune the rules learned by the translation model. In this paper we extend this line of work. First, we propose a simple but effective pruning method; second, to see in which cases better rules can be learned, we investigate different setups with various acoustic and language model combinations; finally, to examine whether the rules from different setups are complementary, lexicons generated via different rule tables are merged in ASR experiments. We report a WER reduction of up to 6.2% with the proposed technique.
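
As a rough illustration of the last step mentioned in the abstract (merging lexicons generated via different rule tables), the sketch below takes the union of grapheme-sequence variants for each word. The abstract does not say how the authors merge lexicons, so the data layout (a dict mapping each word to a list of space-separated grapheme sequences), the function name `merge_lexicons`, and the example entries are all assumptions made for illustration only.

```python
def merge_lexicons(*lexicons):
    """Union the variants of each word across lexicons.

    Assumed layout (hypothetical): each lexicon maps a word to a list of
    space-separated grapheme sequences. First-seen order is preserved and
    exact duplicate variants are dropped.
    """
    merged = {}
    for lexicon in lexicons:
        for word, variants in lexicon.items():
            seen = merged.setdefault(word, [])
            for variant in variants:
                if variant not in seen:
                    seen.append(variant)
    return merged


# Hypothetical lexicons, as if each came from a different rule table.
lex_a = {"colour": ["c o l o u r", "c o l o r"]}
lex_b = {"colour": ["c o l o r", "k o l o r"], "night": ["n i t e"]}

merged = merge_lexicons(lex_a, lex_b)
```

A plain union keeps every variant that any setup proposes, which suits the question of whether rules from different setups are complementary; a real system might additionally weight or prune the merged variants.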

Keywords :

language translation; learning (artificial intelligence); speech recognition; ASR; SMT; automatic speech recognition; grapheme lexicon refinement; pruning method; statistical machine translation method; Accuracy; Acoustics; Data models; Hidden Markov models; Speech; Speech recognition; Training data; automatic speech recognition; grapheme lexicon; lexicon learning; statistical machine translation; system fusion;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on

Conference_Location :

Chengdu

Type :

conf

DOI :

10.1109/ChinaSIP.2015.7230355

Filename :

7230355

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1948085