• DocumentCode
    119763
  • Title
    Recent advances in the statistical modeling of the Slovak language
  • Author
    Stas, Jan ; Hladek, Daniel ; Juhar, Jozef
  • Author_Institution
    Dept. of Electron. & Multimedia Commun., Tech. Univ. of Košice, Košice, Slovakia
  • fYear
    2014
  • fDate
    10-12 Sept. 2014
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This paper describes recent advances in the statistical modeling of the Slovak language for the transcription of dictated, semi-spontaneous, and spontaneous conversational speech, such as judicial readings, broadcast news TV and radio shows, parliamentary proceedings, educational talks and lectures, and interactive conversations. In recent months, we have improved the efficiency and robustness of Slovak language models trained on electronic and web-based language resources through better text processing and document classification, class-based and filled-pause modeling, n-gram augmentation, and fast language model adaptation. Experiments on judicial readings, broadcast news recordings, and parliamentary proceedings show a significant decrease in word error rate for multiple configurations of the acoustic and language models of the Slovak transcription system in the presented scenarios.
  • Keywords
    Internet; natural language processing; pattern classification; speech recognition; statistical analysis; text analysis; Slovak language model efficiency improvement; Slovak language model robustness improvement; Slovak transcription system configurations; Web-based language resources; acoustic models; broadcast news recordings; class-based modeling; dictated speech transcription; document classification; electronic resources; fast language model adaptation; filled pauses modeling; judicial readings; language models; n-grams augmentation; parliament proceeding; semispontaneous conversational speech transcription; spontaneous conversational speech transcription; statistical modeling; text processing; word error rate reduction; Adaptation models; Computational modeling; Data models; Databases; Hidden Markov models; Speech; Speech recognition; Language Model Adaptation; Language Modeling; Slovak Language; Speech Recognition; Spontaneous Speech;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    ELMAR (ELMAR), 2014 56th International Symposium
  • Conference_Location
    Zadar
  • Type
    conf
  • DOI
    10.1109/ELMAR.2014.6923310
  • Filename
    6923310