DocumentCode
178733
Title
Submodular subset selection for large-scale speech training data
Author
Kai Wei ; Yuzong Liu ; Kirchhoff, Katrin ; Bartels, Christopher ; Bilmes, Jeff
Author_Institution
Dept. of Electr. Eng., Univ. of Washington, Seattle, WA, USA
fYear
2014
fDate
4-9 May 2014
Firstpage
3311
Lastpage
3315
Abstract
We address the problem of subselecting a large set of acoustic data to train automatic speech recognition (ASR) systems. To this end, we apply a novel data selection technique based on constrained submodular function maximization. Though NP-hard, the combinatorial optimization problem can be approximately solved by a simple and scalable greedy algorithm with constant-factor guarantees. We evaluate our approach by subselecting data from 1300 hours of conversational English telephone data to train two types of large-vocabulary speech recognizers, one with Gaussian mixture model (GMM) based acoustic models, and another based on deep neural networks (DNNs). We show that training data can be reduced significantly, and that our technique outperforms both random selection and a previously proposed selection method utilizing comparable resources. Notably, using the submodular selection method, the DNN system using only about 5% of the training data is able to achieve performance on par with the GMM system using 100% of the training data. With the baseline subset selection methods, however, the DNN system is unable to match this performance.
Keywords
Gaussian processes; combinatorial mathematics; neural nets; optimisation; speech recognition; ASR systems; DNN; GMM based acoustic models; Gaussian mixture model; NP-hard problem; automatic speech recognition; combinatorial optimization problem; constant-factor guarantees; constrained submodular function maximization; data selection technique; deep neural networks; large-scale speech training data; large-vocabulary speech recognizers; submodular subset selection method; Acoustics; Hidden Markov models; Speech; Speech processing; Speech recognition; Training; Training data; large-scale systems; machine learning; speech processing
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6854213
Filename
6854213