DocumentCode :
734996
Title :
Improving HMM/DNN in ASR of under-resourced languages using probabilistic sampling
Author :
Meixu Song ; Qingqing Zhang ; Jielin Pan ; Yonghong Yan
Author_Institution :
Key Lab. of Speech Acoust. & Content Understanding, Beijing, China
fYear :
2015
fDate :
12-15 July 2015
Firstpage :
20
Lastpage :
24
Abstract :
In HMM/DNN automatic speech recognition (ASR) systems, the DNNs model the posterior probabilities of triphone states. However, triphone states are unevenly distributed in the training data. In this situation, the training algorithm tends to converge to a local optimum that favors states with abundant data over states with sparse data. This imbalance in the training data degrades ASR performance, especially for under-resourced languages. To deal with this issue, we explore a resampling technique called "probabilistic sampling", which can be seen as a linear smoothing between the original sampling and uniform sampling. The effectiveness of probabilistic sampling was studied in two under-resourced ASR experiments. With probabilistic sampling, the first experiment achieved a 6.3% relative phone error rate (PER) reduction compared to the conventional DNN baseline; the second experiment used a shared-hidden-layer multilingual DNN as the baseline and obtained a 4.9% relative PER reduction.
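The "linear smoothing between the original sampling and the uniform sampling" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used form P(c) = lam / K + (1 - lam) * prior(c), where K is the number of classes (triphone states) and prior(c) is the relative frequency of class c in the training data. The abstract does not state the exact formula, so this form, the mixing weight `lam`, and the helper names `sampling_distribution` and `draw_batch` are assumptions made for illustration only.

```python
import random
from collections import Counter


def sampling_distribution(labels, lam):
    """Return the class-sampling probabilities for a mixing weight lam.

    Assumed form: P(c) = lam / K + (1 - lam) * prior(c).
    lam = 0 reproduces the original (empirical) class distribution;
    lam = 1 gives a uniform distribution over the observed classes.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {
        c: lam / num_classes + (1.0 - lam) * n / total
        for c, n in counts.items()
    }


def draw_batch(frames, labels, lam, batch_size, rng=None):
    """Draw (frame, label) pairs: pick a class first, then a frame of that class."""
    rng = rng or random.Random()
    # Group training frames by their class label.
    by_class = {}
    for frame, label in zip(frames, labels):
        by_class.setdefault(label, []).append(frame)
    dist = sampling_distribution(labels, lam)
    classes = list(dist)
    weights = [dist[c] for c in classes]
    # Step 1: sample classes from the smoothed distribution.
    chosen = rng.choices(classes, weights=weights, k=batch_size)
    # Step 2: sample one frame uniformly from each chosen class.
    return [(rng.choice(by_class[c]), c) for c in chosen]
```

Under these assumptions, intermediate values of `lam` raise the chance that frames from rare states appear in a training batch while still retaining part of the original class prior.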
Keywords :
hidden Markov models; learning (artificial intelligence); neural nets; probability; signal sampling; smoothing methods; speech recognition; ASR system; HMM-DNN automatic speech recognition system; PER reduction; deep neural network; linear smoothing; phone error rate; posterior probability; probabilistic sampling; resampling technique; shared-hidden-layer multilingual DNN; training algorithm; training data imbalance; triphone state; underresourced language; uniform sampling; Acoustics; Hidden Markov models; Probabilistic logic; Speech; Speech recognition; Training; Training data; Automatic speech recognition; HMM/DNN hybrid; probabilistic sampling; under-resourced languages
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on
Conference_Location :
Chengdu
Type :
conf
DOI :
10.1109/ChinaSIP.2015.7230354
Filename :
7230354