Title :
On-line speaking rate estimation using Gaussian mixture models
Author :
Faltlhauser, R. ; Pfau, T. ; Ruske, G.
Author_Institution :
Inst. for Human-Machine-Commun., Tech. Univ. of Munich, Germany
Abstract :
Gaussian mixture models (GMMs) are a widespread tool in applications such as speaker identification and verification. In contrast to hidden Markov models (HMMs), Gaussian mixture models are designed to model the general properties of an underlying acoustic source. In this paper we extend the application of GMMs to the assessment of speaking rate. Trained directly on the acoustic data, they can either be applied directly to estimate the speech rate category or, with the help of a mapping function, provide a continuous measure of the speaking rate. The mapping function can be realized by means of a neural net. First experiments showed a correlation coefficient of 0.66 between the lexical phoneme rate and our estimate, which is based on speech-rate-dependent spectral variation. Moreover, our approach can be used simultaneously for high-accuracy on-line gender detection.
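The two uses described in the abstract (picking a speech rate category, and mapping GMM scores to a continuous rate) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-dimensional toy features, the hand-set GMM parameters, the category names "slow" and "fast", equal category priors, and the per-category rates in `CLASS_RATES`. The abstract says the mapping function can be realized by a neural net; the sketch swaps in a much simpler posterior-weighted average as a stand-in.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of all frames under one GMM.

    gmm is a list of (weight, mean, variance) components.
    """
    return sum(
        log_sum_exp([math.log(w) + log_gauss_diag(x, mean, var)
                     for w, mean, var in gmm])
        for x in frames
    )

def classify_rate(frames, gmms):
    """Score the frames under each category GMM (equal priors assumed)."""
    scores = {label: gmm_log_likelihood(frames, g) for label, g in gmms.items()}
    norm = log_sum_exp(list(scores.values()))
    posteriors = {label: math.exp(s - norm) for label, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors

def continuous_rate(posteriors, class_rates):
    """Stand-in mapping function: posterior-weighted average of class rates."""
    return sum(posteriors[c] * class_rates[c] for c in posteriors)

# Hypothetical hand-set models; in practice these would be trained on
# acoustic feature vectors from speech of each rate category.
GMMS = {
    "slow": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 0.0], [1.0, 1.0])],
    "fast": [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 4.0], [1.0, 1.0])],
}
# Hypothetical typical phoneme rates (phonemes per second) per category.
CLASS_RATES = {"slow": 10.0, "fast": 16.0}

frames = [[4.2, 3.9], [4.8, 4.1], [5.1, 4.0]]  # toy feature vectors
label, posteriors = classify_rate(frames, GMMS)
rate = continuous_rate(posteriors, CLASS_RATES)
print(label, rate)
```

Because the scores accumulate frame by frame, the same computation could be updated incrementally as frames arrive, which fits the on-line setting the title refers to; the windowing needed for that is not specified in this record and is left out of the sketch.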
Keywords :
Gaussian processes; neural nets; parameter estimation; speaker recognition; spectral analysis; Gaussian mixture models; correlation coefficient; high accuracy on-line gender detection; lexical phoneme rate; mapping function; neural net; on-line speaking rate estimation; speaker identification; speaker verification; speech rate category; speech rate dependent spectral variation; underlying acoustic source; Acoustic measurements; Energy measurement; Extraterrestrial measurements; Hidden Markov models; Loudspeakers; Multilayer perceptrons; Neural networks; Speech; Training data;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.861830