DocumentCode :
3326174
Title :
Waveform quantization of speech using Gaussian mixture models
Author :
Samuelsson, Jonas
Author_Institution :
Dept. Signals, Sensors & Syst., R. Inst. of Technol., Stockholm, Sweden
Volume :
1
fYear :
2004
fDate :
17-21 May 2004
Abstract :
Waveform quantization of speech using Gaussian mixture models (GMMs) is proposed. GMMs are trained directly on the speech waveform, and high-dimensional vector quantizers (VQs) that efficiently exploit the redundancy are constructed from the GMM parameters. Two types of GMM are studied. The complexity of the scheme is independent of the rate, and the rate can be changed without retraining the VQ. A shape-gain structure improves performance and robustness. Pre- and post-processing using spectral amplitude warping further improves perceptual quality. A 32-dimensional VQ operating at 2 bits/sample reproduces speech sampled at 8 kHz with a PESQ score of 4.2.
Keywords :
Gaussian distribution; redundancy; spectral analysis; speech codecs; speech coding; vector quantisation; 8 kHz; GMM training; Gaussian mixture models; VQ; audio codecs; high dimensional vector quantizers; perceptual quality; shape-gain structure; spectral amplitude warping; speech waveform quantization; Bandwidth; Codecs; Covariance matrix; Data compression; Design optimization; Quantization; Robustness; Sensor systems; Speech; Surface acoustic waves
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1325948
Filename :
1325948