DocumentCode :
3752237
Title :
Two-stage lexicon optimization of G2P-converted pronunciation dictionary based on statistical acoustic confusability measure
Author :
Nam Kyun Kim;Woo Kyeong Seong;Hun Kyu Ha;Hong Kook Kim
Author_Institution :
School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju 61005, Korea
Year :
2015
Firstpage :
135
Lastpage :
138
Abstract :
In this paper, we propose a two-stage lexicon optimization method based on a statistical acoustic confusability measure to generate an optimized lexicon for automatic speech recognition (ASR). A lexicon is commonly built by grapheme-to-phoneme (G2P) conversion. However, G2P conversion is often realized as a 1-to-N-best mapping, which increases the lexicon size. To mitigate this problem, the proposed method prunes confusable words from the lexicon by using a confusability measure (CM) defined as an acoustic model (AM) based distance between two pronunciation variants. In particular, the first stage coarsely prunes the lexicon with a CM defined from monophone-based hidden Markov models (HMMs), and the second stage prunes it further with a CM defined from triphone-based HMMs. ASR experiments demonstrate that a system employing the proposed lexicon optimization method achieves a relative word error rate reduction of 18.88% on a Wall Street Journal task, compared with a system using a G2P-converted pronunciation dictionary without any optimization.
Keywords :
"Hidden Markov models","Acoustics","Optimization methods","Dictionaries","Acoustic measurements","Speech recognition"
Publisher :
IEEE
Conference_Title :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2015 Asia-Pacific
Type :
conf
DOI :
10.1109/APSIPA.2015.7415488
Filename :
7415488