DocumentCode :
542650
Title :
Rate-distortion optimal sinusoidal modeling of audio and speech using psychoacoustical matching pursuits
Author :
Heusdens, Richard ; van de Par, Steven
Author_Institution :
Dept. of Mediamatics, Delft University of Technology, 2628 CD, The Netherlands
Volume :
2
fYear :
2002
fDate :
13-17 May 2002
Abstract :
In this paper, we propose a rate-distortion optimal algorithm for sinusoidal modeling of audio and speech. The algorithm uses a variable-length analysis window where the total number of sinusoids needed to model the source signal is optimally distributed over the segments. To account for human auditory perception, we use a new perceptually relevant distortion measure, which is combined with the psychoacoustical matching pursuit algorithm to select the desired sinusoidal components. We discuss the encoding of the segmentation information and show how to reduce this overhead by restricting the minimum and maximum size of the constituent segments. Although this restricts the number of possible partitionings of the input signal, new segments can still start with high accuracy in time. By doing so, we can decrease the segmentation overhead by 50%, almost without loss of coding efficiency and without introducing pre-echoes.
Keywords :
Encoding; Lakes; Speech;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5744975
Filename :
5744975