Title :
Convolutional maxout neural networks for low-resource speech recognition
Author :
Meng Cai ; Yongzhe Shi ; Jian Kang ; Jia Liu ; Tengrong Su
Author_Institution :
Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
Abstract :
Building speech recognition systems with limited data resources is a fast-progressing topic. In this paper, we propose the convolutional maxout neural network acoustic model for low-resource speech recognition. There are three motivations for this model. The first is to exploit prior knowledge of local speech spectrum features by applying convolutional structures. The second is to shrink the model size and enable better optimization by using the maxout nonlinearity. The third is to enhance model generalization and control overfitting by applying dropout training. All three motivations compensate for the lack of training data. Experiments on a 24-hour subset of the Switchboard corpus show that the convolutional structure, the maxout nonlinearity and dropout training each improve performance on this task, and that the combination of the three techniques achieves a relative improvement of more than 10.0% over a convolutional neural network baseline.
Keywords :
acoustic signal processing; neural nets; speech recognition; control overfitting enhancement; convolutional maxout neural network acoustic model; convolutional structures; dropout training; local speech spectrum features; low-resource speech recognition; maxout nonlinearity; model generalization enhancement; switchboard corpus; Acoustics; Biological neural networks; Neurons; Speech; Speech recognition; Training; Convolutional Neural Networks; Low-Resource; Maxout Networks; Speech Recognition;
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
DOI :
10.1109/ISCSLP.2014.6936676
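
Illustration :
The maxout nonlinearity and dropout training named in the abstract can be sketched roughly as follows. This is a minimal, generic illustration and not the authors' implementation: the function names `maxout` and `dropout`, the group size of 2, and the drop probability of 0.5 are all assumptions chosen for the example, since the abstract does not give the paper's settings.

```python
import random


def maxout(pre_activations, pieces):
    """Maxout: split linear outputs into groups of `pieces`, keep each group's max."""
    if pieces <= 0 or len(pre_activations) % pieces != 0:
        raise ValueError("length must be a positive multiple of pieces")
    return [
        max(pre_activations[i:i + pieces])
        for i in range(0, len(pre_activations), pieces)
    ]


def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero a unit with probability drop_prob, rescale survivors."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)  # no-op at evaluation time
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Hypothetical pre-activations of one layer: 6 linear outputs, 2 pieces per unit.
z = [0.5, -1.0, 2.0, 0.1, -0.3, -0.2]
h = maxout(z, pieces=2)  # 3 maxout units
h_train = dropout(h, drop_prob=0.5, rng=random.Random(0))
h_eval = dropout(h, drop_prob=0.5, rng=random.Random(0), training=False)
```

Because each maxout unit emits one value per group of linear pieces, the layer feeding the next one is narrower than the number of linear outputs, which is one way to read the abstract's remark about shrinking the model size. The dropout shown here is the common "inverted" variant, chosen so that no rescaling is needed at evaluation time.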