• DocumentCode
    713974
  • Title
    Evaluation of advanced language modeling techniques for the Slovak LVCSR
  • Author
    Zlacky, Daniel ; Stas, Jan ; Juhar, Jozef ; Cizmar, Anton
  • Author_Institution
    Dept. of Electron. & Multimedia Commun., Tech. Univ. of Kosice, Kosice, Slovakia
  • fYear
    2015
  • fDate
    21-22 April 2015
  • Firstpage
    195
  • Lastpage
    198
  • Abstract
    This paper compares several advanced language modeling techniques for Slovak continuous speech recognition. Five language modeling techniques were analyzed with respect to model size, perplexity, speech recognition performance, and the complexity of using them under real conditions of speech recognition in Slovak. Preliminary experimental results show that conventional n-gram models smoothed by the Witten-Bell back-off algorithm produce the best performance with respect to model perplexity and recognition accuracy. The other techniques, namely Maximum Entropy, Power Law Discounting, the Hierarchical Pitman-Yor process, and variable-order Kneser-Ney smoothed models, achieved better results only in model perplexity. Their higher computational requirements and worse recognition performance limit their use in real speech recognition tasks in Slovak.
  • Keywords
    entropy; natural language processing; speech recognition; Slovak continuous speech recognition performance; Witten-Bell back-off algorithm; advanced language modeling techniques; computational requirements; hierarchical Pitman-Yor process; maximum entropy; model perplexity; model size; n-gram models; power law discounting; variable-order Kneser-Ney smoothed model; worse recognition performance limit; Adaptation models; Computational modeling; Data models; Hidden Markov models; Speech; Speech recognition; Training; continuous speech recognition; language model; model perplexity; word error rate;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Radioelektronika (RADIOELEKTRONIKA), 2015 25th International Conference
  • Conference_Location
    Pardubice
  • Print_ISBN
    978-1-4799-8117-5
  • Type
    conf
  • DOI
    10.1109/RADIOELEK.2015.7129007
  • Filename
    7129007