DocumentCode
1954721
Title
A parallel implementation of Viterbi training for acoustic models using graphics processing units
Author
Buthpitiya, Senaka ; Lane, Ian ; Chong, Jike
Author_Institution
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear
2012
fDate
13-14 May 2012
Firstpage
1
Lastpage
10
Abstract
Robust and accurate speech recognition systems can only be realized with adequately trained acoustic models. For common languages, state-of-the-art systems are trained on many thousands of hours of speech data, and even with large clusters of machines the entire training process can take many weeks. To overcome this development bottleneck, we propose a parallel implementation of Viterbi training optimized for training Hidden-Markov-Model (HMM)-based acoustic models using highly parallel graphics processing units (GPUs). In this paper, we introduce Viterbi training, illustrate its application concurrency characteristics and data working set sizes, and describe the optimizations required for effective throughput on GPU processors. We demonstrate that the acoustic model training process is well suited to GPUs. Using a single NVIDIA GTX580 GPU, our proposed approach is shown to be 94.8× faster than a sequential CPU implementation, enabling a moderately sized acoustic model to be trained on 1000 hours of speech data in under 7 hours. Moreover, we show that our implementation on a two-GPU system can perform 3.3× faster than a standard parallel reference implementation on a high-end 32-core Xeon server at 1/15th the cost. Our GPU-based training platform empowers research groups to rapidly evaluate new ideas and build accurate and robust acoustic models on very large training corpora at nominal cost.
Keywords
acoustic signal processing; computer based training; graphics processing units; hidden Markov models; speech recognition; GPU-based training platform; HMM; NVIDIA GTX580 GPU; Viterbi training; acoustic model training process; application concurrency characteristics; data working set sizes; effective throughput; hidden-Markov-model-based acoustic models; parallel implementation; speech recognition systems; Acoustics; Computational modeling; Concurrent computing; Graphics processing unit; Hidden Markov models; Instruction sets; Training; Acoustic Model Training; Continuous Speech Recognition; Graphics Processing Unit
fLanguage
English
Publisher
IEEE
Conference_Titel
Innovative Parallel Computing (InPar), 2012
Conference_Location
San Jose, CA
Print_ISBN
978-1-4673-2632-2
Electronic_ISBN
978-1-4673-2631-5
Type
conf
DOI
10.1109/InPar.2012.6339590
Filename
6339590