DocumentCode
542670
Title
Auditory VOCODER: Speech resynthesis from an auditory Mellin representation
Author
Irino, T. ; Patterson, R.D. ; Kawahara, H.
Author_Institution
NTT Communication Science Laboratories, Japan
Volume
2
fYear
2002
fDate
13-17 May 2002
Abstract
We assume that speech morphing, noise suppression, and speech segregation would improve if they were more accurately based on human perception. Accordingly, an Auditory VOCODER was developed to resynthesize speech from an auditory Mellin representation used to explain human perception. The Auditory VOCODER has three modules: an Auditory Mellin Image model [9,10], a STRAIGHT VOCODER [2], and a mapping module consisting of warped-frequency cepstral analysis and nonlinear, multivariate regression analysis (MRA). We describe the modules and an evaluation of the system. Informal listening indicates that the sound quality is reasonable.
Keywords
Computational modeling; Computer languages; Degradation; Erbium; Optical character recognition software;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location
Orlando, FL, USA
ISSN
1520-6149
Print_ISBN
0-7803-7402-9
Type
conf
DOI
10.1109/ICASSP.2002.5745004
Filename
5745004