DocumentCode
835872
Title
Low bit-rate voice compression based on frequency domain interpolative techniques
Author
Bhaskar, Udaya ; Swaminathan, Kumar
Author_Institution
Hughes Network Syst. Inc., Germantown, MD, USA
Volume
14
Issue
2
fYear
2006
fDate
3/1/2006 12:00:00 AM
Firstpage
558
Lastpage
576
Abstract
This paper presents an approach, referred to as frequency domain interpolation (FDI), for achieving high-quality speech at low bit-rates (4 kb/s and below) within reasonable complexity and delay. FDI methods, like prototype waveform interpolation (PWI) methods, derive a prototype waveform (PW) at regular intervals of time. Unlike PWI, however, FDI does not separate the PW into a slowly evolving waveform (SEW) and a rapidly evolving waveform (REW) component. Instead, the PW is encoded after gain normalization in magnitude-phase form. The magnitude is modeled as a sum of mean and deviation values in multiple frequency bands, and this model is quantized using switched backward adaptive VQ techniques. The phase information is represented as a composite vector of PW correlations in multiple frequency bands and an overall voicing measure. This information is quantized using a VQ at the encoder. At the decoder, a phase model is employed that uses the received phase (and magnitude) information to reproduce PWs with the correct periodicity and evolutionary characteristics. Speech is synthesized by interpolating the reconstructed PWs after gain adjustment and filtering the result using the short-term predictor and a postfilter. The designs of a 4-kb/s and a 2.4-kb/s FDI codec are presented in this paper, and their performance is characterized in terms of delay, complexity, and subjective voice quality. The results confirm that FDI techniques have the potential for delivering high-quality speech at low bit-rates in a cost-effective manner.
Keywords
decoding; frequency-domain analysis; interpolation; speech codecs; speech coding; vector quantisation; vocoders; decoder; frequency domain interpolative techniques; low bit-rate voice compression; switched backward adaptive VQ; vector quantization; voice codec; Decoding; Delay; Fault detection; Filtering; Frequency domain analysis; Frequency measurement; Interpolation; Phase measurement; Prototypes; Speech synthesis; Frequency domain interpolation (FDI); linear prediction; prototype waveform interpolation; voice coding
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TSA.2005.857803
Filename
1597260