• DocumentCode
    417124
  • Title

    Non-uniform speaker normalization using affine-transformation

  • Author

    Kumar, S.V.B. ; Umesh, S. ; Sinha, Rohit

  • Author_Institution
    Imaging Technol. Lab, Gen. Electr.-Global Res., Bangalore, India
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    We propose a mathematical model to describe the relation between the formant frequencies of speakers and show that with the proposed affine model, speaker differences separate out as translation factors when a "Mel-like" warping is performed. Using speech data, we estimate the parameters of this warping function and show that it is close to the usual Mel-formula. This model is motivated by Rohit Sinha and S. Umesh's shift-based non-uniform speaker-normalization method (see Proc. IEEE ICASSP, 2002), which provides improvement over conventional maximum-likelihood based speaker normalization methods. We therefore provide a unified mathematical framework that connects the relationship between formants of speakers to the method of removing speaker differences (which involves Mel-warping), and which is substantiated by our recognition experiments.
  • Keywords
    parameter estimation; speaker recognition; speech processing; Mel-warping; affine transformation; formant frequencies; mathematical model; nonuniform speaker normalization; speaker differences; Auditory system; Databases; Frequency estimation; Loudspeakers; Mathematical model; Maximum likelihood estimation; Parameter estimation; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2004.1325937
  • Filename
    1325937
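
  • Illustration
    The abstract states that, under an affine model, speaker differences become translation factors after a "Mel-like" warping. The sketch below checks that property numerically for one simple affine form. It is an illustration only, not the paper's estimated model: the relation (f' + c) = alpha * (f + c), the offset c = 700 Hz taken from the common Mel formula, the value of alpha, and the sample formant values are all assumptions, and the names `mel` and `affine_map` are hypothetical.

```python
import math

# Offset of the common Mel formula, mel(f) = 2595 * log10(1 + f / 700).
# Assumed here; the paper estimates its own warping parameters from data.
MEL_OFFSET_HZ = 700.0


def mel(f_hz):
    """Common Mel warping of a frequency given in Hz."""
    return 2595.0 * math.log10(1.0 + f_hz / MEL_OFFSET_HZ)


def affine_map(f_hz, alpha, c=MEL_OFFSET_HZ):
    """Hypothetical affine speaker relation: (f' + c) = alpha * (f + c)."""
    return alpha * (f_hz + c) - c


# Made-up formant frequencies (Hz) for a reference speaker.
formants_b = [500.0, 1500.0, 2500.0]
alpha = 1.1  # assumed speaker-dependent factor

# Formants of a second speaker under the assumed affine relation.
formants_a = [affine_map(f, alpha) for f in formants_b]

# Difference between the two speakers on the warped (Mel) axis.
shifts = [mel(fa) - mel(fb) for fa, fb in zip(formants_a, formants_b)]

# Under this affine form the shift should not depend on frequency.
expected_shift = 2595.0 * math.log10(alpha)

for fb, fa, s in zip(formants_b, formants_a, shifts):
    print(f"{fb:7.1f} Hz -> {fa:7.1f} Hz, Mel shift = {s:.4f}")
```

    Under these assumptions every formant moves by the same amount, 2595 * log10(alpha), on the warped axis, which is the sense in which a speaker difference reduces to a translation. A purely multiplicative relation (c = 0) would instead call for a plain logarithmic warping.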