DocumentCode
109571
Title
Adaptation of Morph-Based Speech Recognition for Foreign Names and Acronyms
Author
Mansikkaniemi, Andre ; Kurimo, Mikko
Author_Institution
Dept. of Signal Process. & Acoust., Aalto Univ., Aalto, Finland
Volume
23
Issue
5
fYear
2015
fDate
May 2015
Firstpage
941
Lastpage
950
Abstract
In this paper, we improve a morph-based speech recognition system by focusing adaptation efforts on acronyms (ACRs) and foreign proper names (FPNs). An unsupervised language model (LM) adaptation framework based on two-pass decoding is used. Vocabulary adaptation is applied alongside unsupervised LM adaptation. The aim is to improve both language and pronunciation modeling for FPNs and ACRs. A smart selection algorithm is used to find the most likely topically related foreign words and acronyms from in-domain text. New pronunciation rules are generated for the selected words. Different kinds of morpheme adaptation operations are also evaluated on the ACR and FPN candidate words to ensure that pronunciation adaptation yields the best results. Statistically significant improvements in average word error rate (WER) and term error rate (TER) are achieved by combining unsupervised LM adaptation with vocabulary adaptation focused on ACRs and FPNs.
Keywords
decoding; error statistics; speech recognition; vocabulary; ACR candidate words; FPN candidate words; LM adaptation framework; TER; WER; acronyms; foreign proper names; foreign words; in-domain text; morph-based speech recognition adaptation; pronunciation rules; smart selection algorithm; term error rate; two-pass decoding; unsupervised language model; vocabulary adaptation; word error rate; Adaptation models; Speech; Speech processing; Speech recognition; Terminology; Training; Vocabulary; Foreign word detection; morph-based speech recognition; out-of-vocabulary (OOV) recognition; unsupervised language model (LM) adaptation;
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
2329-9290
Type
jour
DOI
10.1109/TASLP.2015.2414818
Filename
7063956