DocumentCode :
542670
Title :
Auditory VOCODER: Speech resynthesis from an auditory Mellin representation
Author :
Irino, T. ; Patterson, R.D. ; Kawahara, H.
Author_Institution :
NTT Communication Science Laboratories, Japan
Volume :
2
fYear :
2002
fDate :
13-17 May 2002
Abstract :
We assume that speech morphing, noise suppression, and speech segregation would improve if they were more accurately based on human perception. Accordingly, an Auditory VOCODER was developed to resynthesize speech from an auditory Mellin representation used to explain human perception. The Auditory VOCODER has three modules: an Auditory Mellin Image model [9,10], a STRAIGHT VOCODER [2], and a mapping module consisting of warped-frequency cepstral analysis and nonlinear, multivariate regression analysis (MRA). We describe the modules and an evaluation of the system. Informal listening indicates that the sound quality is reasonable.
Keywords :
Computational modeling; Computer languages; Degradation; Erbium; Optical character recognition software;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5745004
Filename :
5745004