Title :
Soft-clustering technique for training data in age- and gender-independent speech recognition
Author :
Enami, D. ; Faqiang Zhu ; Yamamoto, Koji ; Nakagawa, Sachiko
Author_Institution :
Dept. of Comput. Sci. & Eng., Toyohashi Univ. of Technol., Toyohashi, Japan
Abstract :
In this paper, we propose approaches for Gaussian mixture model (GMM)-based soft clustering of training data and for GMM- and/or hidden Markov model (HMM)-based cluster selection in age- and gender-independent speech recognition. In speaker-class-dependent speech recognition, increasing the number of speaker classes typically yields more specific models and thus better recognition performance. However, increasing the number of classes also reduces the amount of training data available for each class model, which leads to unreliable model parameters. To address this reduction in training data, we propose a GMM-based soft clustering method that allows clusters to overlap, and a method for selecting a speaker model using a GMM and/or an HMM. In an experiment, we obtained a 5.0% absolute improvement in word error rate (WER), corresponding to a 24.9% relative improvement, over an age- and gender-dependent baseline.
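The idea of soft clustering with overlap can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the features, the overlap criterion, or the selection rule, so the one-dimensional feature, the posterior threshold of 0.2, and all function names below are assumptions made for illustration only.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def posteriors(x, weights, means, variances):
    """Posterior probability of each GMM component given the feature x."""
    logs = [math.log(w) + gaussian_logpdf(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def soft_assign(post, threshold=0.2):
    """Soft clustering: a speaker joins every cluster whose posterior
    reaches the (hypothetical) threshold, so clusters may overlap.
    The best-scoring cluster is always included."""
    best = max(range(len(post)), key=post.__getitem__)
    return sorted({best} | {k for k, p in enumerate(post) if p >= threshold})


def select_cluster(post):
    """Cluster selection at recognition time: pick the single most
    probable cluster (assumed rule; ties go to the lowest index)."""
    return max(range(len(post)), key=post.__getitem__)


if __name__ == "__main__":
    # Hypothetical two-component GMM over a single speaker-level feature.
    weights, means, variances = [0.5, 0.5], [100.0, 200.0], [900.0, 900.0]
    for x in (100.0, 150.0, 190.0):
        post = posteriors(x, weights, means, variances)
        print(x, soft_assign(post), select_cluster(post))
```

Under these assumptions, a speaker lying between two components contributes training data to both class models, which is how overlap can offset the per-class data reduction described above, while recognition still uses a single selected model.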
Keywords :
Gaussian processes; hidden Markov models; learning (artificial intelligence); speech recognition; GMM; Gaussian mixture model; HMM-based cluster selection; WER; age-independent speech recognition; gender-independent speech recognition; hidden Markov model; soft clustering; soft-clustering technique; speaker model; speaker-class-dependent speech recognition; training data reduction; word error rate; Adaptation models; Context modeling; Educational institutions; Hidden Markov models; Lead; Training;
Conference_Title :
Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
Conference_Location :
Hollywood, CA
Print_ISBN :
978-1-4673-4863-8