Title :
Auditory VOCODER: Speech resynthesis from an auditory Mellin representation
Author :
Irino, T. ; Patterson, R.D. ; Kawahara, H.
Author_Institution :
NTT Communication Science Laboratories, Japan
Abstract :
We assume that speech morphing, noise suppression, and speech segregation would improve if they were more accurately based on human perception. Accordingly, an Auditory VOCODER was developed to resynthesize speech from an auditory Mellin representation used to explain human perception. The Auditory VOCODER has three modules: an Auditory Mellin Image model [9,10], a STRAIGHT VOCODER [2], and a mapping module consisting of warped-frequency cepstral analysis and nonlinear, multivariate regression analysis (MRA). We describe the modules and an evaluation of the system. Informal listening indicates that the sound quality is reasonable.
Keywords :
Computational modeling; Computer languages; Degradation; Erbium; Optical character recognition software;
Conference_Title :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
Print_ISBN :
0-7803-7402-9
DOI :
10.1109/ICASSP.2002.5745004