Title :
Long-term flexible 2D cepstral modeling of speech spectral amplitudes
Author :
Firouzmand, Mohammad ; Girin, Laurent
Author_Institution :
Grenoble Lab. of Images, Grenoble
Date :
March 31 2008-April 4 2008
Abstract :
This paper presents a method for modeling the envelope of the spectral amplitude parameters of speech signals in two dimensions (2D). The method cascades two models. The first, along the frequency axis, is the usual cepstrum technique: the log-scaled spectral envelope is modeled with a discrete cosine model (DCM). The second, along the time axis, models the trajectory of each envelope DCM coefficient with another, similar DCM. An iterative algorithm is proposed to fit this 2D model to the data optimally with respect to a perceptual criterion based on frequency masking. The approach is shown to provide an efficient and flexible representation of spectral amplitude parameters in terms of coefficient rates while maintaining good signal quality, which opens new perspectives for very-low-bit-rate sinusoidal speech coding.
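The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it replaces the iterative, perceptually weighted fit with plain least squares, and it assumes that every frame is sampled on the same normalized frequency grid (in sinusoidal analysis the harmonic frequencies generally differ from frame to frame). The function names `cosine_basis`, `fit_2d_dcm`, and `eval_2d_dcm`, the basis `cos(pi * k * x)`, and the model orders are hypothetical choices made for illustration only.

```python
import numpy as np


def cosine_basis(x, order):
    """Return a matrix whose columns are cos(pi * k * x) for k = 0..order.

    x is assumed to be normalized to [0, 1] (an illustrative convention).
    """
    k = np.arange(order + 1)
    return np.cos(np.pi * np.outer(x, k))


def fit_2d_dcm(log_env, freqs, times, p_order, q_order):
    """Unweighted least-squares fit of a 2D cosine model (sketch only).

    log_env : array of shape (n_frames, n_freqs), log-amplitude envelope samples
    freqs   : normalized frequencies in [0, 1], shared by all frames (assumption)
    times   : normalized frame times in [0, 1]
    Returns a coefficient matrix of shape (q_order + 1, p_order + 1).
    """
    freq_basis = cosine_basis(freqs, p_order)   # (n_freqs, P + 1)
    time_basis = cosine_basis(times, q_order)   # (n_frames, Q + 1)

    # Stage 1 (frequency axis): cepstrum-like coefficients for each frame.
    ceps, *_ = np.linalg.lstsq(freq_basis, log_env.T, rcond=None)   # (P + 1, n_frames)

    # Stage 2 (time axis): model each coefficient trajectory with a second DCM.
    coeffs, *_ = np.linalg.lstsq(time_basis, ceps.T, rcond=None)    # (Q + 1, P + 1)
    return coeffs


def eval_2d_dcm(coeffs, freqs, times):
    """Evaluate the 2D model on a (times x freqs) grid of log amplitudes."""
    q_order = coeffs.shape[0] - 1
    p_order = coeffs.shape[1] - 1
    return cosine_basis(times, q_order) @ coeffs @ cosine_basis(freqs, p_order).T
```

In this sketch, a long-term section of `n_frames` frames is represented by `(q_order + 1) * (p_order + 1)` coefficients, so the two orders control the coefficient rate. The perceptual criterion based on frequency masking would enter as a weighting of the fitting error, which is omitted here.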
Keywords :
discrete cosine transforms; iterative methods; speech processing; cascaded modelings; discrete cosine model; frequency masking; iterative algorithm; log-scaled spectral envelope; long-term flexible 2D cepstral modeling; perceptual criterion; sinusoidal speech coding; speech signals; speech spectral amplitudes; cepstral analysis; speech; speech analysis; speech coding; speech modeling; speech synthesis
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518515