Auditory VOCODER: Speech resynthesis from an auditory Mellin representation

Author

Irino, T. ; Patterson, R.D. ; Kawahara, H.

Author_Institution

NTT Communication Science Laboratories, Japan

Volume

fYear

2002

fDate

13-17 May 2002

Abstract

We assume that speech morphing, noise suppression, and speech segregation would improve if they were more accurately based on human perception. Accordingly, an Auditory VOCODER was developed to resynthesize speech from an auditory Mellin representation used to explain human perception. The Auditory VOCODER has three modules: an Auditory Mellin Image model [9,10], a STRAIGHT VOCODER [2], and a mapping module consisting of warped-frequency cepstral analysis and nonlinear, multivariate regression analysis (MRA). We describe the modules and an evaluation of the system. Informal listening indicates that the sound quality is reasonable.

Keywords

Computational modeling; Computer languages; Degradation; Erbium; Optical character recognition software

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on

Conference_Location

Orlando, FL, USA

ISSN

1520-6149

Print_ISBN

0-7803-7402-9

Type

conf

DOI

10.1109/ICASSP.2002.5745004

Filename

5745004

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=542670