DocumentCode :
3661595
Title :
VTLN Based Approaches for Speech Recognition with Very Limited Training Speakers
Author :
Sung Min Ban;Bo Kyung Choi;Young Ho Choi;Hyung Soon Kim
Author_Institution :
Dept. of Electron. Eng., Pusan Nat. Univ., Busan, South Korea
fYear :
2014
Firstpage :
285
Lastpage :
288
Abstract :
In this paper, two approaches using vocal tract length normalization (VTLN) are examined for dealing with the acoustic mismatch between speakers in automatic speech recognition, in the special case where training data are available for only a small number of speakers. The first is the conventional VTLN approach, in which both training and test utterances are frequency-warped according to a maximum likelihood (ML) based warping factor estimation scheme in order to normalize speaker characteristics. The second builds a virtually speaker-independent (SI) acoustic model from artificially generated multi-speaker data, obtained by VTLN-based frequency warping of the training utterances from the limited speakers. To compare the two approaches, Korean isolated word recognition experiments are performed with a small amount of training data from a limited number of speakers. The experimental results show that the virtually SI acoustic model approach outperforms both the conventional VTLN approach and the baseline system when training speakers are very limited.
Keywords :
"Acoustics","Silicon","Speech recognition","Mathematical model","Hidden Markov models","Speech","Training"
Publisher :
IEEE
Conference_Title :
Intelligent Systems, Modelling and Simulation (ISMS), 2014 5th International Conference on
ISSN :
2166-0662
Type :
conf
DOI :
10.1109/ISMS.2014.55
Filename :
7280922