• DocumentCode
    2286819
  • Title
    Applying N-best keyword search to continuous speech recognition for telecommunication-based applications
  • Author
    Feng, Ming-Whei
  • Author_Institution
    GTE Labs. Inc., Waltham, MA, USA
  • fYear
    1994
  • fDate
    13-16 Apr 1994
  • Firstpage
    726
  • Abstract
    An N-best keyword search algorithm was developed for a continuous speech recognizer that models vocabulary words as well as extraneous sounds and noise, in order to achieve high sentence accuracy. The continuous speech recognizer was developed for telecommunication-based applications, which typically demand high sentence accuracy. Possible approaches for achieving high sentence accuracy include applying complicated speech modeling techniques or employing more knowledge sources when conducting the recognition search. An alternative solution is to first apply an N-best decoding search to obtain N sentence hypotheses using pre-selected knowledge source(s) and then re-score those hypotheses using other knowledge source(s) or models. The proposed N-best keyword search algorithm derives all keyword sentence hypotheses and the corresponding likelihood scores time-synchronously. We show that the algorithm is guaranteed to find all sentence hypotheses. To reduce the exponentially growing number of hypotheses, in the practical implementation we applied empirically derived thresholds to prune the search. Recognition experiments were conducted on two speech corpora, the TI Connected Digit Corpus and the Road Rally Corpus, to show the effectiveness of the proposed method.
  • Keywords
    decoding; speech analysis and processing; speech coding; speech recognition; vocabulary; N-best decoding search; N-best keyword search; Road Rally Corpus; TI Connected Digit Corpus; continuous speech recognition; empirically derived thresholds; knowledge sources; likelihood scores; recognition search; search algorithm; sentence accuracy; speech modeling techniques; telecommunication-based applications; vocabulary words; Acoustic waves; Hidden Markov models; Keyword search; Laboratories; Maximum likelihood decoding; Probability; Speech enhancement; Speech recognition; Viterbi algorithm; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Speech, Image Processing and Neural Networks, 1994. Proceedings, ISSIPNN '94., 1994 International Symposium on
  • Print_ISBN
    0-7803-1865-X
  • Type
    conf
  • DOI
    10.1109/SIPNN.1994.344809
  • Filename
    344809