• DocumentCode
    696995
  • Title

    Representing speech

  • Author

    Kleijn, W. Bastiaan

  • Author_Institution
    Department of Speech, Music and Hearing, KTH (Royal Institute of Technology), 100 44 Stockholm, Sweden
  • fYear
    2000
  • fDate
    4-8 Sept. 2000
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    The properties of the speech production process and the auditory periphery have led to the usage of similar speech signal representations for various processing tasks such as speech and speaker recognition, speech synthesis, and speech coding. The representation is generally divided into a description of the vocal-tract transfer function and the excitation source. For recognition purposes, the biased characterization of the vocal-tract transfer function by a time sequence of low-dimension cepstral vectors performs well. For coding and synthesis, we argue that for the vocal-tract transfer function autoregressive (AR) models are more effective than filter banks, while for the excitation source pitch-synchronous filter banks and modulation-domain filters are most effective. A clear trend exists towards the exploitation of the time variation of both the vocal-tract transfer function and the excitation source.
  • Keywords
    Mel frequency cepstral coefficient; Modulation; Speech; Speech processing; Speech recognition; Transfer functions
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference, 2000 10th European
  • Conference_Location
    Tampere, Finland
  • Print_ISBN
    978-952-1504-43-3
  • Type

    conf

  • Filename
    7075841