• DocumentCode
    835872
  • Title
    Low bit-rate voice compression based on frequency domain interpolative techniques
  • Author
    Bhaskar, Udaya ; Swaminathan, Kumar
  • Author_Institution
    Hughes Network Syst. Inc., Germantown, MD, USA
  • Volume
    14
  • Issue
    2
  • fYear
    2006
  • fDate
    3/1/2006 12:00:00 AM
  • Firstpage
    558
  • Lastpage
    576
  • Abstract
    This paper presents an approach, referred to as frequency domain interpolation (FDI), for achieving high-quality speech at low bit-rates (4 kb/s and below) within reasonable complexity and delay. FDI methods, like the prototype waveform interpolation (PWI) methods, derive a prototype waveform (PW) at regular intervals of time. Unlike PWI, however, there is no separation into a slowly evolving waveform (SEW) and a rapidly evolving waveform (REW) component. Instead, the PW is encoded after gain normalization in magnitude-phase form. The magnitude is modeled as a sum of mean and deviation values in multiple frequency bands, and this model is quantized using switched backward adaptive VQ techniques. The phase information is represented as a composite vector of PW correlations in multiple frequency bands and an overall voicing measure. This information is quantized using a VQ at the encoder. At the decoder, a phase model is employed that uses the received phase (and magnitude) information to reproduce PWs with the correct periodicity and evolutionary characteristics. Speech is synthesized by interpolating the reconstructed PWs after gain adjustment and filtering the result using the short-term predictor and a postfilter. The designs of a 4-kb/s and a 2.4-kb/s FDI codec are presented in this paper, and their performance is characterized in terms of delay, complexity, and subjective voice quality. The results confirm that FDI techniques have the potential for delivering high-quality speech at low bit-rates in a cost-effective manner.
  • Keywords
    decoding; frequency-domain analysis; interpolation; speech codecs; speech coding; vector quantisation; vocoders; decoder; frequency domain interpolative techniques; low bit-rate voice compression; switched backward adaptive VQ; vector quantization; voice codec; Decoding; Delay; Fault detection; Filtering; Frequency domain analysis; Frequency measurement; Interpolation; Phase measurement; Prototypes; Speech synthesis; Frequency domain interpolation (FDI); linear prediction; prototype waveform interpolation; voice coding;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TSA.2005.857803
  • Filename
    1597260