• DocumentCode
    2016625
  • Title
    A study of large vocabulary speech recognition decoding using finite-state graphs
  • Author
    Ou, Zhijian; Xiao, Ji
  • Author_Institution
    Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
  • fYear
    2010
  • fDate
    Nov. 29 - Dec. 3, 2010
  • Firstpage
    123
  • Lastpage
    128
  • Abstract
    The use of weighted finite-state transducers (WFSTs) has become an attractive technique for building large vocabulary continuous speech recognition decoders. Conventionally, the compiled search network is represented as a standard WFST, which is then fed directly into a Viterbi decoder. In this work, we use the standard WFST representations and operations while compiling the search network. The compiled WFST is then equivalently converted to a new graphical representation, which we call a finite-state graph (FSG). The resulting FSG is better tailored to Viterbi decoding for speech recognition and more compact in memory. This paper presents our effort to build a state-of-the-art WFST-based speech recognition system, which we call GrpDecoder. GrpDecoder is benchmarked separately on two languages, English and Mandarin. The test results show that GrpDecoder, which uses the new FSG representation in searching, is superior to HTK's HDecode and IDIAP's Juicer for both languages, achieving lower error rates at a given recognition speed.
  • Keywords
    Viterbi decoding; finite state machines; speech recognition; vocabulary; GrpDecoder benchmarking; Viterbi decoder; WFST representation; compiled search network; finite-state graph; graphical representation; large vocabulary speech recognition decoding; Acoustics; Decoding; Hidden Markov models; Memory management; Speech recognition; Transducers; Viterbi algorithm; WFST; GrpDecoder
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2010 7th International Symposium on Chinese Spoken Language Processing (ISCSLP)
  • Conference_Location
    Tainan
  • Print_ISBN
    978-1-4244-6244-5
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2010.5684837
  • Filename
    5684837