Regional Information Center for Science and Technology - Improved phonotactic language recognition based on RNN feature reconstruction

DocumentCode :

179517

Title :

Improved phonotactic language recognition based on RNN feature reconstruction

Author :

Wei-Wei Liu ; Wei-Qiang Zhang ; Yongzhe Shi ; An Ji ; Jiaming Xu ; Jia Liu

Author_Institution :

Dept. of Electron. Eng., Tsinghua Univ., Beijing, China

Year :

2014

Date :

4-9 May 2014

Firstpage :

5322

Lastpage :

5326

Abstract :

Phone recognition followed by a support vector machine (PR-SVM) has been proposed for language recognition tasks and has shown encouraging results. However, it still suffers from problems such as the curse of dimensionality caused by increasing the order of the N-gram feature supervector and the rapid growth in the number of possible parameters caused by exact matching of the phoneme history. These problems hamper the ability of the N-gram vector space model (VSM) to handle long-term contexts. In this paper, a recurrent neural network (RNN) based feature reconstruction (FR) method is presented to compensate for the deficiencies of the N-gram features in phonotactic language recognition. Experiments are conducted on the 2009 National Institute of Standards and Technology language recognition evaluation (NIST LRE) database. The results show that the proposed method gives relative error rate reductions of 8.76%, 3.82%, and 11.93% for the 30 s, 10 s, and 3 s conditions, respectively, compared with the baseline system.
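
The dimensionality problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it only assumes that an N-gram supervector holds one normalized count per possible phone N-gram, so its length grows as the phone inventory size raised to the power N. The function name `ngram_supervector`, the four-phone inventory, and the phone string are all hypothetical and chosen for illustration; practical systems typically use far larger phone sets and counts derived from recognizer output.

```python
from collections import Counter
from itertools import product


def ngram_supervector(phones, inventory, n):
    """Return normalized N-gram counts over every possible N-gram of `inventory`."""
    # Count each N-gram that actually occurs in the decoded phone sequence.
    counts = Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    # One dimension per possible N-gram, so the length is len(inventory) ** n.
    return [counts[gram] / total for gram in product(inventory, repeat=n)]


inventory = ["a", "b", "k", "s"]          # hypothetical toy phone set
phones = ["k", "a", "s", "a", "b", "a"]   # hypothetical decoded phone string

for n in (1, 2, 3):
    vec = ngram_supervector(phones, inventory, n)
    nonzero = sum(1 for v in vec if v > 0)
    print(f"N={n}: dimension={len(vec)}, nonzero entries={nonzero}")
```

Even in this toy setting the vector length multiplies by the inventory size with each increase in N while the number of observed N-grams stays small, which is the sparsity that makes high-order N-gram features hard to use for long-term context.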

Keywords :

feature extraction; natural language processing; recurrent neural nets; signal reconstruction; speech recognition; FR method; N-gram feature supervector; N-gram vector space model; NIST LRE database; PR-SVM; RNN; RNN feature reconstruction method; VSM; curse of dimensionality; improved phonotactic language recognition; phone recognition; phoneme history; recurrent neural networks; support vector machine; technology language recognition evaluation database; Context; NIST; Recurrent neural networks; Speech recognition; Support vector machines; Training; Vectors; feature reconstruction (FR); language recognition; recurrent neural networks (RNN);

Language :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location :

Florence

Type :

conf

DOI :

10.1109/ICASSP.2014.6854619

Filename :

6854619

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=179517