DocumentCode :
1420393
Title :
Multistep coding of speech parameters for compression
Author :
Baghai-Ravary, Ladan ; Beet, Steve W.
Author_Institution :
Aculab plc, Milton Keynes, UK
Volume :
6
Issue :
5
fYear :
1998
fDate :
9/1/1998
Firstpage :
435
Lastpage :
444
Abstract :
This paper presents specific new techniques for coding of speech representations and a new general approach to coding for compression that directly utilizes the multidimensional nature of the input data. Many methods of speech analysis yield a two-dimensional (2-D) pattern, with time as one of the dimensions. Various such speech representations, and power spectrum sequences in particular, are shown here to be amenable to 2-D compression using specific models which take account of a large part of their structure in both dimensions. Two newly developed techniques, multistep adaptive flux interpolation (MAFI) and multistep flow-based prediction (MFBP), are presented. These are able to code power spectral density (PSD) sequences of speech more completely and accurately than conventional methods. This is due to their ability to model nonstationary, but piecewise-continuous, signals, of which speech is a good example. Initially, MAFI and MFBP are applied in the time domain, then reapplied to the encoded data in the second dimension. This approach allows the coding algorithm to exploit redundancy in both dimensions, giving a significant improvement in the overall compression ratio. Furthermore, the compression may be reapplied several times. The data is further compressed with each application.
Keywords :
adaptive signal processing; data communication; interpolation; prediction theory; sequences; signal representation; spectral analysis; speech coding; 2D compression; coding algorithm; compression ratio; input data; linear AR model; multistep adaptive flux interpolation; multistep coding; multistep flow-based prediction; nonstationary signals; piecewise-continuous signals; power spectral density sequences; redundancy; speech analysis; speech coding; speech parameters; speech representations; time domain; Automatic speech recognition; Interpolation; Multidimensional systems; Oral communication; Signal processing; Speech analysis; Speech coding; Speech processing; Two dimensional displays; Vectors;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.709669
Filename :
709669