Regional Information Center for Science and Technology (RICEST) - H- and C-level WFST-based large vocabulary continuous speech recognition on Graphics Processing Units

DocumentCode :

2160851

Title :

H- and C-level WFST-based large vocabulary continuous speech recognition on Graphics Processing Units

Author :

Kim, Jungsuk ; You, Kisun ; Sung, Wonyong

Author_Institution :

Sch. of Electr. Eng., Seoul Nat. Univ., Seoul, South Korea

fYear :

2011

fDate :

22-27 May 2011

Firstpage :

1733

Lastpage :

1736

Abstract :

We have implemented 20,000-word large vocabulary continuous speech recognition (LVCSR) systems employing H- and C-level weighted finite state transducer (WFST) based networks on Graphics Processing Units (GPUs). Both the emission probability computation and the Viterbi beam search are implemented on the GPU in a data-parallel manner to minimize the extra data transfer time between the host CPU and the GPU. This study utilizes word-length optimization techniques to reduce the synchronization overhead in the Viterbi beam search. We achieve a speed-up of 18.6% to 21.9% by using an efficient data packing method, with less than 0.2% accuracy degradation. Furthermore, we explore different levels of abstraction in recognition network generation to reduce the number of synchronization operations as well as to minimize the memory usage. The experimental results show that the systems implemented on the GPU perform speech recognition 4.07 to 4.55 times faster than highly optimized sequential implementations on a CPU.
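
The abstract does not spell out the data packing method, so the following is only a minimal sketch of one common way such packing can reduce synchronization in a parallel Viterbi search: a reduced-precision path cost and a back-pointer are stored in a single 32-bit word, so that one atomic minimum updates both fields together. The 16/16 bit split, the scale factor `kCostScale`, and the names `pack`, `unpack_cost`, `unpack_arc` and `relax` are all assumptions made for illustration and are not taken from the paper. The sketch uses host-side C++ with `std::atomic`; on a GPU the compare-and-swap loop would typically be a single hardware atomic-minimum instruction.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical layout (an assumption, not the paper's): the upper 16 bits
// hold a quantized path cost, the lower 16 bits hold an arc index.
constexpr unsigned kIdBits = 16;
constexpr std::uint32_t kIdMask = (1u << kIdBits) - 1u;
constexpr std::uint32_t kMaxQuantCost = 0xFFFFu;
constexpr float kCostScale = 64.0f;  // fixed-point steps per unit cost (assumed)

// Quantize a non-negative cost and pack it above the arc index. Because the
// cost occupies the high bits, comparing packed words compares costs first.
std::uint32_t pack(float cost, std::uint32_t arc_id) {
    assert(arc_id <= kIdMask);
    float scaled = std::max(0.0f, cost) * kCostScale;
    std::uint32_t q = scaled >= static_cast<float>(kMaxQuantCost)
                          ? kMaxQuantCost
                          : static_cast<std::uint32_t>(scaled + 0.5f);
    return (q << kIdBits) | arc_id;
}

float unpack_cost(std::uint32_t word) {
    return static_cast<float>(word >> kIdBits) / kCostScale;
}

std::uint32_t unpack_arc(std::uint32_t word) { return word & kIdMask; }

// Keep the lowest-cost hypothesis for a destination state. One atomic
// update writes the cost and the back-pointer together, so no lock or
// second synchronized write is needed.
void relax(std::atomic<std::uint32_t>& slot, float cost, std::uint32_t arc_id) {
    const std::uint32_t candidate = pack(cost, arc_id);
    std::uint32_t current = slot.load(std::memory_order_relaxed);
    while (candidate < current &&
           !slot.compare_exchange_weak(current, candidate,
                                       std::memory_order_relaxed)) {
        // compare_exchange_weak refreshed `current`; retry while still better.
    }
}
```

A destination slot would start at `0xFFFFFFFF` (worst possible cost); after several threads call `relax` on it, `unpack_cost` and `unpack_arc` recover the winning cost and its arc. Shortening the cost field loses precision, which is the kind of trade-off between speed and accuracy that the abstract reports.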

Keywords :

computer graphic equipment; coprocessors; maximum likelihood estimation; search problems; speech recognition; C-level weighted finite state transducer based networks; CPU; GPU; H-level weighted finite state transducer based networks; Viterbi beam search; data packing method; emission probability computation; graphics processing units; large vocabulary continuous speech recognition systems; recognition network generation; Decoding; Graphics processing unit; Hidden Markov models; History; Instruction sets; Speech recognition; Synchronization; Graphics Processing Unit; Parallelization; Speech recognition; WFSTs; Word-length optimization;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location :

Prague

ISSN :

1520-6149

Print_ISBN :

978-1-4577-0538-0

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2011.5946836

Filename :

5946836

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2160851