Regional Information Center for Science and Technology - Improving bottleneck features for automatic speech recognition using gammatone-based cochleagram and sparsity regularization

DocumentCode :

3752152

Title :

Improving bottleneck features for automatic speech recognition using gammatone-based cochleagram and sparsity regularization

Author :

Chao Ma;Jun Qi;Dongmei Li;Runsheng Liu

Author_Institution :

Department of Electronic Engineering, Tsinghua University, Beijing, China, 100084

Year :

2015

Firstpage :

Lastpage :

Abstract :

Bottleneck (BN) features, particularly those based on deep neural network structures, have been successfully applied to Automatic Speech Recognition (ASR) tasks. This paper continues the study of improving BN features for ASR tasks by employing two different methods: (1) using a Cochleagram generated by Gammatone filters as the input feature for a deep neural network; (2) imposing sparsity regularization on the bottleneck layer to control the sparsity level of the BN features by constraining the activations of the hidden units to be inactive, on average, most of the time. Our experiments on the Wall Street Journal (WSJ) database demonstrate that both approaches yield performance gains for BN features in ASR tasks. In addition, further experiments on the WSJ database at different noise levels show that the Cochleagram as input is more noise-robust than the commonly used Mel-scaled filterbank.
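
The sparsity constraint described in the abstract can be illustrated with a short sketch. The abstract does not give the exact form of the regularizer, so the example below assumes the KL-divergence penalty commonly used in sparse autoencoders; the function name `kl_sparsity_penalty`, the target activation level of 0.05, and the synthetic activations are all hypothetical and are not taken from the paper.

```python
import numpy as np


def kl_sparsity_penalty(activations, target=0.05, eps=1e-8):
    """Assumed KL-divergence sparsity penalty (not confirmed by the record).

    activations: array of shape (num_frames, num_units) holding
    bottleneck-layer outputs in (0, 1), e.g. sigmoid activations.
    target: desired average activation per hidden unit (hypothetical value).
    """
    # Average activation of each hidden unit over all frames.
    rho_hat = np.clip(activations.mean(axis=0), eps, 1.0 - eps)
    # KL divergence between Bernoulli(target) and Bernoulli(rho_hat), per unit.
    kl = (target * np.log(target / rho_hat)
          + (1.0 - target) * np.log((1.0 - target) / (1.0 - rho_hat)))
    # Sum over hidden units; this term would be added to the training cost.
    return float(kl.sum())


# Synthetic activations for illustration only (not data from the paper).
rng = np.random.default_rng(0)
sparse_acts = rng.uniform(0.0, 0.1, size=(200, 40))  # mostly inactive units
dense_acts = rng.uniform(0.4, 0.6, size=(200, 40))   # units active half the time

penalty_sparse = kl_sparsity_penalty(sparse_acts)
penalty_dense = kl_sparsity_penalty(dense_acts)
```

Under this assumed form, the penalty is close to zero when each unit's average activation matches the target and grows as units become active more often, which is one way to keep hidden units inactive on average most of the time.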

Keywords :

"Neural networks","Frequency modulation","Indexes","Cost function","Automatic speech recognition","Mel frequency cepstral coefficient"

Publisher :

IEEE

Conference_Title :

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2015 Asia-Pacific

Type :

conf

DOI :

10.1109/APSIPA.2015.7415401

Filename :

7415401

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3752152