Title :
Recent advances in the statistical modeling of the Slovak language
Author :
Stas, Jan ; Hladek, Daniel ; Juhar, Jozef
Author_Institution :
Dept. of Electron. & Multimedia Commun., Tech. Univ. of Kosice, Košice, Slovakia
Abstract :
This paper describes recent advances in the statistical modeling of the Slovak language for the transcription of dictated, semi-spontaneous and spontaneous conversational speech, such as judicial readings, broadcast news TV and radio shows, parliament proceedings, educational talks and lectures, or interactive conversations. Over the last months, we have improved the efficiency and robustness of Slovak language models trained on electronic and web-based language resources, including better text processing and document classification, class-based and filled-pause modeling, n-gram augmentation and fast language model adaptation. Experiments performed on judicial readings, broadcast news recordings and parliament proceedings show a significant decrease in the word error rate for multiple configurations of the Slovak transcription system's acoustic and language models in the presented scenarios.
Keywords :
Internet; natural language processing; pattern classification; speech recognition; statistical analysis; text analysis; Slovak language model efficiency improvement; Slovak language model robustness improvement; Slovak transcription system configurations; Web-based language resources; acoustic models; broadcast news recordings; class-based modeling; dictated speech transcription; document classification; electronic resources; fast language model adaptation; filled pauses modeling; judicial readings; language models; n-grams augmentation; parliament proceeding; semispontaneous conversational speech transcription; spontaneous conversational speech transcription; statistical modeling; text processing; word error rate reduction; Adaptation models; Computational modeling; Data models; Databases; Hidden Markov models; Speech; Speech recognition; Language Model Adaptation; Language Modeling; Slovak Language; Speech Recognition; Spontaneous Speech;
Conference_Title :
ELMAR (ELMAR), 2014 56th International Symposium
Conference_Location :
Zadar
DOI :
10.1109/ELMAR.2014.6923310