DocumentCode :
835872
Title :
Low bit-rate voice compression based on frequency domain interpolative techniques
Author :
Bhaskar, Udaya ; Swaminathan, Kumar
Author_Institution :
Hughes Network Syst. Inc., Germantown, MD, USA
Volume :
14
Issue :
2
fYear :
2006
fDate :
3/1/2006 12:00:00 AM
Firstpage :
558
Lastpage :
576
Abstract :
This paper presents an approach, referred to as frequency domain interpolation (FDI), for achieving high-quality speech at low bit-rates (4 kb/s and below) within reasonable complexity and delay. FDI methods, like prototype waveform interpolation (PWI) methods, derive a prototype waveform (PW) at regular intervals of time. Unlike PWI, however, FDI does not separate the PW into a slowly evolving waveform (SEW) component and a rapidly evolving waveform (REW) component. Instead, the PW is encoded after gain normalization in magnitude-phase form. The magnitude is modeled as a sum of mean and deviation values in multiple frequency bands, and this model is quantized using switched backward adaptive VQ techniques. The phase information is represented as a composite vector of PW correlations in multiple frequency bands and an overall voicing measure. This information is quantized using a VQ at the encoder. At the decoder, a phase model is employed that uses the received phase (and magnitude) information to reproduce PWs with the correct periodicity and evolutionary characteristics. Speech is synthesized by interpolating the reconstructed PWs after gain adjustment and filtering the result using the short-term predictor and a postfilter. The designs of a 4-kb/s and a 2.4-kb/s FDI codec are presented in this paper, and their performance is characterized in terms of delay, complexity, and subjective voice quality. The results confirm that FDI techniques have the potential for delivering high-quality speech at low bit-rates in a cost-effective manner.
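As a rough illustration of the magnitude model mentioned in the abstract (a mean value per frequency band plus deviations from that mean), the following minimal sketch may help. It is not the authors' implementation: the function names, the band edges, and the sample values are hypothetical, and the abstract does not specify the actual band layout, the gain normalization, or the quantizer, none of which are modeled here.

```python
def band_mean_deviation(magnitudes, band_edges):
    """Split a magnitude vector into per-band means and per-bin deviations.

    band_edges is a hypothetical list of bin indices that starts at 0 and
    ends at len(magnitudes); band i covers bins band_edges[i]..band_edges[i+1]-1.
    """
    means = []
    deviations = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = magnitudes[lo:hi]
        mean = sum(band) / len(band)
        means.append(mean)
        # Deviation of each bin from the mean of its own band.
        deviations.extend(m - mean for m in band)
    return means, deviations


def reconstruct(means, deviations, band_edges):
    """Rebuild the magnitude vector as band mean + deviation for every bin."""
    out = []
    for mean, lo, hi in zip(means, band_edges[:-1], band_edges[1:]):
        out.extend(mean + d for d in deviations[lo:hi])
    return out


# Illustrative values only: five magnitude bins split into two bands.
magnitudes = [1.0, 3.0, 2.0, 4.0, 6.0]
band_edges = [0, 2, 5]
means, deviations = band_mean_deviation(magnitudes, band_edges)
restored = reconstruct(means, deviations, band_edges)
```

In a codec of this kind the means and deviations would each be quantized before transmission; this sketch only shows that the decomposition itself is lossless before quantization.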
Keywords :
decoding; frequency-domain analysis; interpolation; speech codecs; speech coding; vector quantisation; vocoders; decoder; frequency domain interpolative techniques; low bit-rate voice compression; switched backward adaptive VQ; vector quantization; voice codec; Decoding; Delay; Fault detection; Filtering; Frequency domain analysis; Frequency measurement; Interpolation; Phase measurement; Prototypes; Speech synthesis; Frequency domain interpolation (FDI); linear prediction; prototype waveform interpolation; voice coding;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TSA.2005.857803
Filename :
1597260