Title :
A Sinusoidal Voice Over Packet Coder Tailored for the Frame-Erasure Channel
Author_Institution :
Skype Technol., Stockholm, Sweden
Abstract :
A speech coder tailored especially for the frame-erasure channel—the sinusoidal voice over packet coder (SVOPC)—is proposed. Based on a classified approach, avoiding interframe coding techniques, and synthesizing its output from slowly varying parameters, the coder is inherently robust to packet loss. SVOPC is based on quasi-harmonic modeling of the linear prediction (LP) residual. Both the sinusoidal amplitudes and phases are explicitly encoded using new methods based on Gaussian mixture models. A wide-band (16-kHz sampling frequency) implementation of the coder provides synthesized speech of good subjective quality at around 20 kbps. SVOPC is evaluated by means of subjective listening tests, and compared to a reference system based on G.722.2 (the AMR wide-band codec). Under frame erasure conditions (5%–30% frame erasures generated according to a Gilbert model), SVOPC clearly outperforms G.722.2.
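Illustrative note on the test condition: the abstract states that frame erasures were generated according to a Gilbert model at rates of 5% to 30%, but gives no model parameters. The sketch below shows one common two-state form of such a model (a "good" state where frames arrive and a "bad" state where they are erased). The function names, the mean burst length of 2 frames, and the seed are assumptions chosen for illustration and are not taken from the paper.

```python
import random


def gilbert_params(loss_rate, mean_burst_len):
    """Return (p, q) for a two-state Gilbert chain.

    p = P(good -> bad), q = P(bad -> good).
    The stationary erasure rate is p / (p + q) and the mean
    burst length (consecutive erased frames) is 1 / q.
    """
    if not 0.0 < loss_rate < 1.0:
        raise ValueError("loss_rate must be in (0, 1)")
    if mean_burst_len < 1.0:
        raise ValueError("mean_burst_len must be >= 1 frame")
    q = 1.0 / mean_burst_len
    p = loss_rate * q / (1.0 - loss_rate)
    if p > 1.0:
        raise ValueError("loss_rate too high for this burst length")
    return p, q


def gilbert_erasures(n_frames, p, q, seed=None):
    """Return a list of booleans, True where a frame is erased."""
    rng = random.Random(seed)
    bad = False  # assumed start in the good state
    erased = []
    for _ in range(n_frames):
        if bad:
            # Remain in the bad state with probability 1 - q.
            bad = rng.random() >= q
        else:
            # Enter the bad state with probability p.
            bad = rng.random() < p
        erased.append(bad)
    return erased


if __name__ == "__main__":
    # Assumed example: 10% erasures, mean burst of 2 frames.
    p, q = gilbert_params(loss_rate=0.10, mean_burst_len=2.0)
    pattern = gilbert_erasures(100_000, p, q, seed=0)
    print(f"p={p:.4f} q={q:.4f} empirical rate={sum(pattern) / len(pattern):.4f}")
```

Compared with independent (Bernoulli) erasures at the same average rate, this chain produces losses in bursts, which is the harder case for a decoder that must conceal several consecutive missing frames.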
Keywords :
channel coding; linear predictive coding; speech coding; speech synthesis; vocoders; Gaussian mixture model; frame erasure channel; interframe coding technique; linear prediction residual; quasi harmonic modeling; sinusoidal voice over packet coder; speech coder; Frequency synthesizers; IP networks; Predictive models; Protocols; Quality of service; Robustness; Sampling methods; Wideband; Frame-erasure; harmonic analysis; packet loss concealment; packet switching; variable-dimension; vector quantization; wide-band
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
DOI :
10.1109/TSA.2005.851913