DocumentCode
696995
Title
Representing speech
Author
Kleijn, W. Bastiaan
Author_Institution
Department of Speech, Music and Hearing, KTH (Royal Institute of Technology), 100 44 Stockholm, Sweden
fYear
2000
fDate
4-8 Sept. 2000
Firstpage
1
Lastpage
8
Abstract
The properties of the speech production process and the auditory periphery have led to the usage of similar speech signal representations for various processing tasks such as speech and speaker recognition, speech synthesis, and speech coding. The representation is generally divided into a description of the vocal-tract transfer function and the excitation source. For recognition purposes, the biased characterization of the vocal-tract transfer function by a time sequence of low-dimension cepstral vectors performs well. For coding and synthesis, we argue that for the vocal-tract transfer function autoregressive (AR) models are more effective than filter banks, while for the excitation source pitch-synchronous filter banks and modulation-domain filters are most effective. A clear trend exists towards the exploitation of the time variation of both the vocal-tract transfer function and the excitation source.
Keywords
Mel frequency cepstral coefficient; Modulation; Speech; Speech processing; Speech recognition; Transfer functions
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference, 2000 10th European
Conference_Location
Tampere, Finland
Print_ISBN
978-952-1504-43-3
Type
conf
Filename
7075841