Title :
Perceptual-MVDR based analysis-synthesis of pitch synchronous frames for pitch modification
Author :
Muralishankar, R. ; Shanker, M. Ravi ; Ramakrishnan, A.G.
Author_Institution :
Dept. of Telecommun. Eng., PESIT, Bangalore
Date :
June 23 2008-June 26 2008
Abstract :
In our earlier work [1, 2], we employed minimum variance distortionless response (MVDR) and MVDR-Bauer, respectively, as spectral estimation techniques in place of modified linear prediction (LP) in discrete cosine transform (DCT) based pitch modification [3]. As a natural extension, we introduce psychoacoustic characteristics into [1, 2], resulting in the Perceptual-MVDR (PMVDR) and PMVDR-Bauer algorithms, which are used here for spectral estimation. The latter algorithm employs the Bauer method of spectral factorization because it yields a causal inverse filter. The resulting inverse filters are used to obtain the residual signal from pitch synchronous speech frames. The residual signal is resampled using DCT/IDCT according to the target pitch scale factor. Finally, forward filters realized from the above factorization are used to obtain the pitch-modified speech. The modified speech is evaluated subjectively by 10 listeners, and mean opinion scores (MOS) are obtained for pitch factors from 0.5 to 2. The modified Bark spectral distortion (MBSD) measure is also employed to evaluate objective performance. The proposed approach received higher MOS and achieved lower MBSD than the time-domain pitch synchronous overlap method [4], the modified-LP method [3], and the MVDR-based methods [1, 2]. Further, we modified the pitch contours of 20 affirmative sentences to sound like interrogative sentences, using the current as well as our earlier algorithms, and compared their performance.
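The DCT/IDCT resampling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dct_matrix` and `resample_residual_dct`, the mapping from pitch factor to frame length (new length = old length / factor), and the amplitude scaling are all assumptions made for illustration. The idea shown is to take the DCT of one pitch-synchronous residual frame, truncate or zero-pad the coefficients to the target length, and invert.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (its transpose is the inverse)."""
    k = np.arange(n)[:, None]   # frequency index
    m = np.arange(n)[None, :]   # time index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)     # scale the DC row so the matrix is orthonormal
    return c

def resample_residual_dct(residual, pitch_factor):
    """Resample one residual frame in the DCT domain (hypothetical helper).

    Assumption: raising the pitch by `pitch_factor` shortens the pitch
    period, so the frame length becomes round(len / pitch_factor).
    """
    residual = np.asarray(residual, dtype=float)
    n = residual.size
    m = max(1, int(round(n / pitch_factor)))
    coeffs = dct_matrix(n) @ residual      # forward DCT
    new_coeffs = np.zeros(m)
    keep = min(n, m)
    new_coeffs[:keep] = coeffs[:keep]      # truncate or zero-pad
    new_coeffs *= np.sqrt(m / n)           # keep sample amplitude comparable
    return dct_matrix(m).T @ new_coeffs    # inverse DCT at the new length
```

For example, `resample_residual_dct(frame, 2.0)` halves the length of an 80-sample frame, while a factor of 0.5 doubles it. In the pipeline described above, the resampled residual would then be passed through the forward filter to produce the pitch-modified frame.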
Keywords :
discrete cosine transforms; matrix decomposition; speech processing; speech synthesis; analysis-synthesis; discrete cosine transform; mean opinion scores; minimum variance distortionless response; modified-linear prediction; perceptual-MVDR; pitch modification; pitch synchronous; spectral factorization; Distortion measurement; Filters; Frequency domain analysis; Intelligent systems; Psychoacoustic models; Psychology; Signal processing; Speech analysis
Conference_Title :
Multimedia and Expo, 2008 IEEE International Conference on
Conference_Location :
Hannover
Print_ISBN :
978-1-4244-2570-9
Electronic_ISBN :
978-1-4244-2571-6
DOI :
10.1109/ICME.2008.4607376