Title : 
On optimal modelling of speech spectral transitions
         
        
            Author : 
Athaudage, C.R.N. ; Lech, M.
         
        
            Author_Institution : 
ARC Special Res. Center for Ultra-Broadband Inf. Networks, Melbourne Univ., Vic., Australia
         
        
        
        
        
        
            Abstract : 
In this paper, we propose an optimal spectral transition modelling technique for speech. The proposed technique optimizes the spectral interpolation trajectory by minimizing the mean-square error of the spectral parameters on a frame-by-frame basis. Its performance is compared with that of two spectral interpolation techniques reported in the literature, namely linear interpolation and Gaussian interpolation. Line spectral frequencies are used as the short-term spectral parameter representation of the speech signal. The regions between maximally stable (stationary) frames in the spectral parameter sequence are identified as the regions of spectral transition. Numerical results show that the linear and Gaussian interpolation techniques have similar modelling performance in terms of average spectral distortion. The proposed optimal technique models transitions more accurately than the linear and Gaussian techniques, with an improvement of up to 1 dB in average spectral distortion. The proposed technique can be useful for speech processing applications such as coding and recognition.
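
The abstract does not give the formulation, so the following is only a minimal sketch of the general idea, under stated assumptions: each frame in a transition is modelled as a point on the straight line between the two stable boundary frames, x_n ≈ x_start + w_n (x_end - x_start), and the per-frame weight w_n is chosen by least squares. Linear interpolation fixes w_n = n / (N - 1) instead. The function names, the single-weight-per-frame model, and the toy two-dimensional "LSF" vectors are all hypothetical and are not taken from the paper; the error measured here is the mean-square error of the parameters, not spectral distortion in dB.

```python
from typing import List, Sequence

Vector = Sequence[float]


def linear_weights(num_frames: int) -> List[float]:
    """Interpolation weights that grow linearly from 0 to 1 across the transition."""
    if num_frames < 2:
        raise ValueError("a transition needs at least two frames")
    last = num_frames - 1
    return [n / last for n in range(num_frames)]


def optimal_weights(frames: Sequence[Vector]) -> List[float]:
    """Per-frame least-squares weights (assumed model, not the paper's formulation).

    For each frame x_n, minimise ||x_n - (x_start + w * (x_end - x_start))||^2
    over the scalar w; the closed-form solution is a projection onto the
    direction x_end - x_start.
    """
    start, end = frames[0], frames[-1]
    delta = [e - s for s, e in zip(start, end)]
    energy = sum(d * d for d in delta)
    if energy == 0.0:
        raise ValueError("boundary frames are identical; no transition to model")
    weights = []
    for frame in frames:
        offset = [x - s for x, s in zip(frame, start)]
        weights.append(sum(o * d for o, d in zip(offset, delta)) / energy)
    return weights


def mean_square_error(frames: Sequence[Vector], weights: Sequence[float]) -> float:
    """Mean-square error between the frames and their interpolated models."""
    start, end = frames[0], frames[-1]
    total, count = 0.0, 0
    for frame, w in zip(frames, weights):
        for x, s, e in zip(frame, start, end):
            total += (x - (s + w * (e - s))) ** 2
            count += 1
    return total / count


if __name__ == "__main__":
    # Toy transition of three frames with two parameters each (made-up values).
    toy = [[0.0, 1.0], [0.6, 1.6], [1.0, 2.0]]
    print("linear MSE :", mean_square_error(toy, linear_weights(len(toy))))
    print("optimal MSE:", mean_square_error(toy, optimal_weights(toy)))
```

Because the least-squares weight is the minimiser for each frame under this model, its mean-square error can never exceed that of the linear weights on the same frames; how this translates into spectral distortion depends on details the abstract does not state.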
         
        
            Keywords : 
Gaussian processes; distortion; interpolation; least mean squares methods; signal representation; spectral analysis; speech coding; speech recognition; Gaussian interpolation; average spectral distortion; line spectral frequency; linear interpolation; mean-square-error; optimal spectral transition modelling; short-term spectral parameter representation; spectral interpolation trajectory; speech coding; speech processing; speech recognition; speech spectral transition; Analytical models; Filters; Interpolation; Predictive models; Size control; Size measurement; Speech coding; Speech processing; Speech recognition; Speech synthesis;
         
        
        
        
            Conference_Title : 
Information, Communications and Signal Processing, 2003 and Fourth Pacific Rim Conference on Multimedia. Proceedings of the 2003 Joint Conference of the Fourth International Conference on
         
        
            Print_ISBN : 
0-7803-8185-8
         
        
        
            DOI : 
10.1109/ICICS.2003.1292680