DocumentCode :
178733
Title :
Submodular subset selection for large-scale speech training data
Author :
Kai Wei ; Yuzong Liu ; Kirchhoff, Katrin ; Bartels, Christopher ; Bilmes, Jeff
Author_Institution :
Dept. of Electr. Eng., Univ. of Washington, Seattle, WA, USA
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
3311
Lastpage :
3315
Abstract :
We address the problem of subselecting a large set of acoustic data to train automatic speech recognition (ASR) systems. To this end, we apply a novel data selection technique based on constrained submodular function maximization. Though NP-hard, the combinatorial optimization problem can be approximately solved by a simple and scalable greedy algorithm with constant-factor guarantees. We evaluate our approach by subselecting data from 1300 hours of conversational English telephone data to train two types of large-vocabulary speech recognizers, one with Gaussian mixture model (GMM) based acoustic models, and another based on deep neural networks (DNNs). We show that training data can be reduced significantly, and that our technique outperforms both random selection and a previously proposed selection method utilizing comparable resources. Notably, using the submodular selection method, the DNN system using only about 5% of the training data is able to achieve performance on par with the GMM system using 100% of the training data; with the baseline subset selection methods, however, the DNN system is unable to accomplish this correspondence.
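The greedy procedure mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the facility-location objective, the plain cardinality budget, the toy similarity matrix `SIM`, and the helper names `facility_location` and `greedy_select`. For a monotone submodular objective under a cardinality constraint, this kind of greedy selection is known to reach at least a (1 - 1/e) fraction of the optimal value.

```python
from typing import Callable, List, Sequence, Set


def facility_location(sim: Sequence[Sequence[float]]) -> Callable[[Set[int]], float]:
    """Return f(S) = sum_i max_{j in S} sim[i][j] (monotone submodular for sim >= 0)."""
    def f(selected: Set[int]) -> float:
        if not selected:
            return 0.0
        return sum(max(row[j] for j in selected) for row in sim)
    return f


def greedy_select(f: Callable[[Set[int]], float], n: int, budget: int) -> List[int]:
    """Pick up to `budget` of n items, each step adding the largest marginal gain."""
    chosen: List[int] = []
    chosen_set: Set[int] = set()
    for _ in range(min(budget, n)):
        base = f(chosen_set)
        best_item, best_gain = -1, float("-inf")
        for j in range(n):
            if j in chosen_set:
                continue
            gain = f(chosen_set | {j}) - base  # marginal gain of adding item j
            if gain > best_gain:
                best_item, best_gain = j, gain
        chosen.append(best_item)
        chosen_set.add(best_item)
    return chosen


# Hypothetical pairwise similarities between four utterances (illustration only).
SIM = [
    [1.0, 0.9, 0.3, 0.0],
    [0.9, 1.0, 0.2, 0.3],
    [0.3, 0.2, 1.0, 0.8],
    [0.0, 0.3, 0.8, 1.0],
]

subset = greedy_select(facility_location(SIM), n=len(SIM), budget=2)
print(subset)
```

This naive version re-evaluates the objective for every candidate at every step; it is meant to show the selection rule, not to scale to 1300 hours of speech.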
Keywords :
Gaussian processes; combinatorial mathematics; neural nets; optimisation; speech recognition; ASR systems; DNN; GMM based acoustic models; Gaussian mixture model; NP-hard problem; automatic speech recognition; combinatorial optimization problem; constant-factor guarantees; constrained submodular function maximization; data selection technique; deep neural networks; large-scale speech training data; large-vocabulary speech recognizers; submodular subset selection method; Acoustics; Hidden Markov models; Speech; Speech processing; Speech recognition; Training; Training data; automatic speech recognition; large-scale systems; machine learning; speech processing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6854213
Filename :
6854213