DocumentCode :
3716068
Title :
A useful feature-engineering approach for a LVCSR system based on CD-DNN-HMM algorithm
Author :
Sung Joo Lee;Byung Ok Kang;Hoon Chung;Jeon Gue Park
Author_Institution :
Speech Processing Lab., Electronics and Telecommunications Research Institute (ETRI), 161 Gajeong-dong, Yuseong-gu, Daejeon, 305-350, South Korea
fYear :
2015
Firstpage :
1421
Lastpage :
1425
Abstract :
In this paper, we propose a useful feature-engineering approach for Context-Dependent Deep-Neural-Network Hidden-Markov-Model (CD-DNN-HMM) based Large-Vocabulary-Continuous-Speech-Recognition (LVCSR) systems. The speech recognition performance of an LVCSR system is improved from two feature-engineering perspectives. The first improvement is achieved by adopting intra/inter-frame feature subsets when building the Gaussian-Mixture-Model (GMM) HMMs used for HMM state-level alignment. The second gain comes from additional features that augment the front-end of the DNN. We evaluate the effectiveness of our feature-engineering approach on a series of Korean speech recognition tasks (isolated single-syllable recognition with a medium-sized speech corpus and conversational speech recognition with a large database) using the Kaldi speech recognition toolkit. The results show that the proposed feature-engineering approach outperforms the traditional method that combines a Mel-Frequency Cepstral Coefficient (MFCC) GMM with a Mel-frequency filter-bank output DNN.
Keywords :
"Speech recognition","Speech","Feature extraction","Hidden Markov models","Entropy","Harmonic analysis","Acoustics"
Publisher :
IEEE
Conference_Title :
Signal Processing Conference (EUSIPCO), 2015 23rd European
Electronic_ISBN :
2076-1465
Type :
conf
DOI :
10.1109/EUSIPCO.2015.7362618
Filename :
7362618