Title : 
Wrapped Gaussian Mixture Models for Modeling and High-Rate Quantization of Phase Data of Speech
         
            Author : 
Agiomyrgiannakis, Yannis ; Stylianou, Yannis
         
            Author_Institution : 
Found. of Res. & Technol. Hellas, Inst. of Comput. Sci., Heraklion
         
            Date : 
5/1/2009
         
            Abstract : 
The harmonic representation of speech signals has found many applications in speech processing. This paper presents a novel statistical approach to model the behavior of harmonic phases. Phase information is decomposed into three parts: a minimum phase part, a translation term, and a residual term referred to as dispersion phase. Dispersion phases are modeled by wrapped Gaussian mixture models (WGMMs) using an expectation-maximization algorithm suitable for circular vector data. A multivariate WGMM-based phase quantizer is then proposed and constructed using novel scalar quantizers for circular random variables. The proposed phase modeling and quantization scheme is evaluated in the context of a narrowband harmonic representation of speech. Results indicate that it is possible to construct a variable-rate harmonic codec that is equivalent to iLBC at approximately 13 kbps.
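For readers unfamiliar with the term, a wrapped Gaussian is the density obtained by wrapping an ordinary Gaussian around the unit circle, so that it is periodic in the angle with period 2*pi. The sketch below illustrates only that textbook definition for a single scalar phase and a mixture of such components. It is not the paper's multivariate model, EM algorithm, or quantizer; the function names and the truncation parameter n_wraps are assumptions made for illustration.

```python
import math

def wrapped_gaussian_pdf(theta, mu, sigma, n_wraps=3):
    """Wrapped Gaussian density at angle theta (radians).

    The infinite sum over 2*pi shifts is truncated to 2*n_wraps + 1 terms
    (an illustrative choice, adequate when sigma is small relative to 2*pi).
    """
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(-n_wraps, n_wraps + 1):
        d = theta - mu + 2.0 * math.pi * k
        total += math.exp(-0.5 * (d / sigma) ** 2)
    return norm * total

def wgmm_pdf(theta, weights, mus, sigmas, n_wraps=3):
    """Scalar wrapped Gaussian mixture: a weighted sum of wrapped components."""
    return sum(
        w * wrapped_gaussian_pdf(theta, m, s, n_wraps)
        for w, m, s in zip(weights, mus, sigmas)
    )
```

Because the density is periodic, a phase near pi and a phase near -pi are treated as close neighbours, which an ordinary Gaussian on the real line would not do.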
         
            Keywords : 
Gaussian processes; harmonic analysis; quantisation (signal); speech coding; circular vector data; dispersion phase; expectation-maximization algorithm; harmonic representation; high-rate quantization scheme; phase data; phase modeling; speech processing; speech signal; statistical approach; variable-rate harmonic codec; wrapped Gaussian mixture model; Codecs; Computer science; Context modeling; Expectation-maximization algorithms; Power harmonic filters; Quantization; Signal processing; Speech analysis; Circular statistics; phase quantization; sinusoidal models; voice-over-IP; wrapped Gaussian mixture models (WGMMs)
         
            Journal_Title : 
Audio, Speech, and Language Processing, IEEE Transactions on
         
            DOI : 
10.1109/TASL.2008.2008229