Title :
Fine-tuning HMMs for nonverbal vocalizations in spontaneous speech: A multicorpus perspective
Author :
Prylipko, Dmytro ; Schuller, Björn ; Wendemuth, Andreas
Author_Institution :
Dept. of Electr. Eng. & Inf. Technol., Otto von Guericke Univ. Magdeburg, Magdeburg, Germany
Abstract :
Phenomena such as filled pauses, laughter, breathing, and hesitation play a significant role in everyday human-to-human conversation and have a significant influence on speech recognition accuracy [1]. Because of their nature (e.g. long duration), they should be modeled with a different number of emitting states and Gaussian mixtures. In this paper we address this question and try to determine the most suitable method for finding these parameters: we examine two methods for optimizing hidden Markov model (HMM) configurations for better classification and recognition of nonverbal vocalizations within speech. Experiments were conducted on three conversational databases: TUM AVIC, Verbmobil, and SmartKom. These experiments show that with HMM configurations tailored to a particular database we can achieve a 1-3% improvement in speech recognition accuracy compared to a baseline topology. An in-depth analysis of the discussed methods is provided.
Keywords :
Gaussian processes; hidden Markov models; optimisation; speech recognition; Gaussian mixtures; TUM AVIC; baseline topology; fine-tuning HMMs; hidden Markov model configurations; human-to-human conversation; in-depth analysis; multicorpus perspective; nonverbal vocalization classification; nonverbal vocalization recognition; nonverbal vocalizations; optimization; speech recognition accuracy; spontaneous speech; Accuracy; Databases; Hidden Markov models; Noise; Optimization; Speech; Speech recognition; Spontaneous speech; laughter recognition; multiple corpora; nonverbals;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288949