• DocumentCode
    542670
  • Title

    Auditory VOCODER: Speech resynthesis from an auditory Mellin representation

  • Author

    Irino, T. ; Patterson, R.D. ; Kawahara, H.

  • Author_Institution
    NTT Communication Science Laboratories, Japan
  • Volume
    2
  • fYear
    2002
  • fDate
    13-17 May 2002
  • Abstract
    We assume that speech morphing, noise suppression, and speech segregation would improve if they were more accurately based on human perception. Accordingly, an Auditory VOCODER was developed to resynthesize speech from an auditory Mellin representation used to explain human perception. The Auditory VOCODER has three modules: an Auditory Mellin Image model [9,10], a STRAIGHT VOCODER [2], and a mapping module consisting of warped-frequency cepstral analysis and nonlinear, multivariate regression analysis (MRA). We describe the modules and an evaluation of the system. Informal listening indicates that the sound quality is reasonable.
  • Keywords
    Computational modeling; Computer languages; Degradation; Erbium; Optical character recognition software
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
  • Conference_Location
    Orlando, FL, USA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7402-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2002.5745004
  • Filename
    5745004