Regional Information Center for Science and Technology - WFST-based structural classification integrating DNN acoustic features and RNN language features for speech recognition

DocumentCode :

3431534

Title :

WFST-based structural classification integrating DNN acoustic features and RNN language features for speech recognition

Author :

Do, Quoc Truong ; Nakamura, Satoshi ; Delcroix, Marc ; Hori, Takaaki

Author_Institution :

Nara Inst. of Sci. & Technol., Nara, Japan

fYear :

2015

fDate :

19-24 April 2015

Firstpage :

4959

Lastpage :

4963

Abstract :

This paper proposes a method to train Weighted Finite State Transducer (WFST) based structural classifiers using deep neural network (DNN) acoustic features and recurrent neural network (RNN) language features for speech recognition. Structural classification is an effective approach to highly accurate recognition of structured data, in which the classifier is optimized to maximize discriminative performance using different kinds of features. A WFST-based classifier, which can integrate acoustic, pronunciation, and language features embedded in a composed WFST, was recently extended to incorporate DNN bottleneck (DNNBN) features. In this paper, we further investigate the integration of an RNN language model (RNNLM) with the WFST classifier. To this end, we introduce a lattice rescoring method using an RNNLM for efficient classifier training. In a lecture transcription task, we reduced the word error rate from 19.2% to 18.6% by optimizing the WFST parameters for the DNNBN acoustic and RNNLM language features.

Keywords :

natural language processing; neural nets; speech recognition; DNN; RNN language features; WFST; acoustic features; data structure; deep neural network; language features; pronunciation features; recurrent neural network; speech recognition; structural classification integrating DNN acoustic features; train weighted finite state transducer; Acoustics; Decoding; Feature extraction; History; Lattices; Speech recognition; Training; Lattice rescoring; RNNLM; Speech recognition; Structural classification; WFST-DNN;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location :

South Brisbane, QLD

Type :

conf

DOI :

10.1109/ICASSP.2015.7178914

Filename :

7178914

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3431534