DocumentCode :
696995
Title :
Representing speech
Author :
Kleijn, W. Bastiaan
Author_Institution :
Department of Speech, Music and Hearing, KTH (Royal Institute of Technology), 100 44 Stockholm, Sweden
fYear :
2000
fDate :
4-8 Sept. 2000
Firstpage :
1
Lastpage :
8
Abstract :
The properties of the speech production process and the auditory periphery have led to the use of similar speech signal representations for various processing tasks, such as speech and speaker recognition, speech synthesis, and speech coding. The representation is generally divided into a description of the vocal-tract transfer function and a description of the excitation source. For recognition purposes, the biased characterization of the vocal-tract transfer function by a time sequence of low-dimensional cepstral vectors performs well. For coding and synthesis, we argue that autoregressive (AR) models are more effective than filter banks for the vocal-tract transfer function, while pitch-synchronous filter banks and modulation-domain filters are most effective for the excitation source. A clear trend exists towards exploiting the time variation of both the vocal-tract transfer function and the excitation source.
Keywords :
Mel frequency cepstral coefficient; Modulation; Speech; Speech processing; Speech recognition; Transfer functions;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal Processing Conference, 2000 10th European
Conference_Location :
Tampere, Finland
Print_ISBN :
978-952-1504-43-3
Type :
conf
Filename :
7075841