Regional Information Center for Science and Technology (RICeST) - Rate-distortion optimal sinusoidal modeling of audio and speech using psychoacoustical matching pursuits

DocumentCode :

542650

Title :

Rate-distortion optimal sinusoidal modeling of audio and speech using psychoacoustical matching pursuits

Author :

Heusdens, Richard ; van de Par, Steven

Author_Institution :

Dept. of Mediamatics, Delft University of Technology, 2628 CD, The Netherlands

Volume :

fYear :

2002

fDate :

13-17 May 2002

Abstract :

In this paper, we propose a rate-distortion optimal algorithm for sinusoidal modeling of audio and speech. The algorithm uses a variable-length analysis window where the total number of sinusoids needed to model the source signal is optimally distributed over the segments. To account for human auditory perception, we use a new perceptually relevant distortion measure which is combined with the psychoacoustical matching pursuit algorithm to select the desired sinusoidal components. We discuss the encoding of the segmentation information and show how to reduce this overhead by restricting the minimum and maximum segment size of the constituent segments. Although this restricts the number of possible partitionings of the input signal, we still have a high accuracy in time at which new segments can start. By doing so, we can decrease the segmentation overhead by 50%, almost without loss of coding efficiency and without introducing pre-echoes.
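As a rough illustration of the component-selection step the abstract mentions, the following is a minimal sketch of plain (unweighted) matching pursuit over a sinusoidal dictionary. It is not the authors' psychoacoustical variant: the perceptual distortion measure, the variable-length segmentation, and the rate-distortion optimization are not reproduced here. The function name `matching_pursuit`, the restriction of candidate frequencies to the DFT grid, and the energy-based selection rule are all assumptions made for this example.

```python
import cmath
import math


def matching_pursuit(signal, num_sinusoids):
    """Greedily extract real sinusoids from one real-valued segment.

    Assumption for this sketch: candidate frequencies lie on the DFT grid
    k / N for k = 1 .. N/2 - 1, so that each complex exponential is
    orthogonal to its conjugate and the projection has a closed form.
    Returns (components, residual), where each component is a tuple
    (bin index, amplitude, phase in radians).
    """
    n_samples = len(signal)
    residual = list(signal)
    components = []
    scale = 1.0 / math.sqrt(n_samples)  # makes every atom unit-norm

    for _ in range(num_sinusoids):
        best_bin, best_coeff = None, 0j
        # Correlate the residual with every unit-norm complex exponential.
        for k in range(1, n_samples // 2):
            coeff = scale * sum(
                residual[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                for n in range(n_samples)
            )
            # Unweighted selection: keep the atom with the largest
            # inner-product magnitude (a perceptual variant would weight
            # this choice; that weighting is omitted here).
            if best_bin is None or abs(coeff) > abs(best_coeff):
                best_bin, best_coeff = k, coeff

        # Projecting a real residual onto the atom and its conjugate
        # yields a single real sinusoid with this amplitude and phase.
        amplitude = 2.0 * abs(best_coeff) * scale
        phase = cmath.phase(best_coeff)
        for n in range(n_samples):
            residual[n] -= amplitude * math.cos(
                2.0 * math.pi * best_bin * n / n_samples + phase
            )
        components.append((best_bin, amplitude, phase))

    return components, residual
```

Called on a segment and a sinusoid budget, the function returns the selected components together with the remaining residual; in a rate-distortion setting the budget per segment would itself be chosen by an outer optimization, which this sketch leaves out.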

Keywords :

Encoding; Lakes; Speech;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on

Conference_Location :

Orlando, FL, USA

ISSN :

1520-6149

Print_ISBN :

0-7803-7402-9

Type :

conf

DOI :

10.1109/ICASSP.2002.5744975

Filename :

5744975

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=542650