Regional Information Center for Science and Technology - Multistep coding of speech parameters for compression

DocumentCode :

1420393

Title :

Multistep coding of speech parameters for compression

Author :

Baghai-Ravary, Ladan ; Beet, Steve W.

Author_Institution :

Aculab plc, Milton Keynes, UK

Volume :

Issue :

fYear :

1998

fDate :

9/1/1998

Firstpage :

435

Lastpage :

444

Abstract :

This paper presents specific new techniques for coding of speech representations and a new general approach to coding for compression that directly utilizes the multidimensional nature of the input data. Many methods of speech analysis yield a two-dimensional (2-D) pattern, with time as one of the dimensions. Various such speech representations, and power spectrum sequences in particular, are shown here to be amenable to 2-D compression using specific models which take account of a large part of their structure in both dimensions. Two newly developed techniques, multistep adaptive flux interpolation (MAFI) and multistep flow-based prediction (MFBP), are presented. These are able to code power spectral density (PSD) sequences of speech more completely and accurately than conventional methods. This is due to their ability to model nonstationary, but piecewise-continuous, signals, of which speech is a good example. Initially, MAFI and MFBP are applied in the time domain, then reapplied to the encoded data in the second dimension. This approach allows the coding algorithm to exploit redundancy in both dimensions, giving a significant improvement in the overall compression ratio. Furthermore, the compression may be reapplied several times, and the data is further compressed with each application.
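The two-pass idea in the abstract (code along time, then code the result along the second dimension) can be illustrated with a minimal sketch. This is not MAFI or MFBP, whose details the abstract does not give; a plain previous-sample difference predictor stands in for them, and every name below (`delta_rows`, `encode_2d`, `decode_2d`, the `psd` grid) is a hypothetical chosen for illustration.

```python
# Illustrative stand-in only: simple difference prediction, NOT MAFI/MFBP.
# psd[t][k] is assumed to hold frame t, frequency bin k of a small PSD sequence.

def transpose(grid):
    """Swap the two dimensions of a rectangular list of lists."""
    return [list(col) for col in zip(*grid)]

def delta_rows(grid):
    """Keep each row's first value, then store differences from the previous sample."""
    return [[row[0]] + [b - a for a, b in zip(row, row[1:])] for row in grid]

def undo_delta_rows(grid):
    """Invert delta_rows with a running sum along each row."""
    out = []
    for row in grid:
        acc, rec = 0, []
        for r in row:
            acc += r
            rec.append(acc)
        out.append(rec)
    return out

def encode_2d(psd):
    # Pass 1: predict along time (each frequency bin tracked across frames).
    time_residuals = transpose(delta_rows(transpose(psd)))
    # Pass 2: predict the pass-1 output along the frequency dimension.
    return delta_rows(time_residuals)

def decode_2d(code):
    # Undo the passes in reverse order: frequency first, then time.
    time_residuals = undo_delta_rows(code)
    return transpose(undo_delta_rows(transpose(time_residuals)))

psd = [[10, 12, 14],
       [11, 13, 15],
       [12, 14, 16]]
code = encode_2d(psd)
print(code)                    # [[10, 2, 2], [1, 0, 0], [1, 0, 0]]
print(decode_2d(code) == psd)  # True
```

For this smooth toy grid, the second pass turns most residuals into zeros, which is the kind of redundancy in both dimensions that the abstract says the multistep approach exploits.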

Keywords :

adaptive signal processing; data communication; interpolation; prediction theory; sequences; signal representation; spectral analysis; speech coding; 2D compression; coding algorithm; compression ratio; input data; linear AR model; multistep adaptive flux interpolation; multistep coding; multistep flow-based prediction; nonstationary signals; piecewise-continuous signals; power spectral density sequences; redundancy; speech analysis; speech parameters; speech representations; time domain; Automatic speech recognition; Interpolation; Multidimensional systems; Oral communication; Signal processing; Speech analysis; Speech coding; Speech processing; Two dimensional displays; Vectors

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/89.709669

Filename :

709669

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1420393