DocumentCode :
2933257
Title :
Vocal tract modelling with recurrent neural networks
Author :
Burrows, T.L. ; Niranjan, M.
Author_Institution :
Dept. of Eng., Cambridge Univ., UK
Volume :
5
fYear :
1995
fDate :
9-12 May 1995
Firstpage :
3315
Abstract :
The speech production system is modelled using true glottal excitation as the source and a recurrent neural network to represent the vocal tract. The hidden nodes have feedback delays of one and two samples, making the network equivalent to a parallel formant synthesiser in the linear regions of the hidden node sigmoids. An ARX model identification is carried out to initialise the neural network parameters, which are then re-estimated in an analysis-by-synthesis framework to minimise the synthesis (output) error. Unlike other analysis-by-synthesis speech production models such as CELP, this approach decouples the source and filter, enabling manipulation of the source time-scale to achieve high-quality pitch changes.
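As a rough illustration of the approach summarised above, the Python sketch below initialises a single recurrent node (self-feedback at delays of one and two samples, i.e. a second-order resonator in the linear region of a scaled tanh) from a least-squares ARX fit and resynthesises the signal from a pulse-train excitation. The function names, the scaled-tanh nonlinearity, and the synthetic excitation are illustrative assumptions; the paper's network uses multiple hidden nodes in parallel, true glottal excitation, and re-estimates the parameters by analysis-by-synthesis, none of which this sketch reproduces.

import numpy as np

def arx_init(u, y):
    # Least-squares ARX fit of y[n] = a1*y[n-1] + a2*y[n-2] + b0*u[n],
    # used only to initialise the recurrent node's feedback weights.
    rows, targets = [], []
    for n in range(2, len(y)):
        rows.append([y[n - 1], y[n - 2], u[n]])
        targets.append(y[n])
    theta, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets), rcond=None)
    return theta  # a1, a2, b0

def rnn_node_synthesise(u, a1, a2, b0, gain=4.0):
    # One hidden node with self-feedback at delays of one and two samples.
    # In the linear region of the scaled tanh this reduces to a second-order
    # (single-formant) resonator; several such nodes in parallel would act
    # as a parallel formant synthesiser.
    y = np.zeros(len(u))
    for n in range(len(u)):
        s = b0 * u[n]
        if n >= 1:
            s += a1 * y[n - 1]
        if n >= 2:
            s += a2 * y[n - 2]
        y[n] = gain * np.tanh(s / gain)  # approximately linear for small s
    return y

# Synthetic check: a damped resonance driven by a sparse pulse train
# standing in for the glottal excitation.
u = np.zeros(800)
u[::80] = 1.0
a1_true, a2_true, b0_true = 1.6, -0.81, 1.0   # stable second-order resonator
y = np.zeros_like(u)
for n in range(len(u)):
    y[n] = b0_true * u[n]
    if n >= 1:
        y[n] += a1_true * y[n - 1]
    if n >= 2:
        y[n] += a2_true * y[n - 2]

a1, a2, b0 = arx_init(u, y)
y_hat = rnn_node_synthesise(u, a1, a2, b0)
print("synthesis (output) error:", np.mean((y - y_hat) ** 2))

The residual error here comes from the tanh nonlinearity acting on an ARX initialisation fitted to linear data; in the paper this mismatch is what the subsequent analysis-by-synthesis re-estimation of the parameters is meant to remove.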
Keywords :
IIR filters; delays; digital filters; error analysis; parameter estimation; recurrent neural nets; speech processing; speech synthesis; ARX model identification; analysis-by-synthesis framework; filter; hidden node sigmoids; linear regions; multiple delays; parallel formant synthesiser; pitch changes; recurrent neural networks; source time-scale; speech production; synthesis error; true glottal excitation; vocal tract modelling; Acoustic distortion; Network synthesis; Neural networks; Nonlinear distortion; Nonlinear filters; Production systems; Recurrent neural networks; Speech analysis; Speech synthesis; Vocoders;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
1995 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-95)
Conference_Location :
Detroit, MI
ISSN :
1520-6149
Print_ISBN :
0-7803-2431-5
Type :
conf
DOI :
10.1109/ICASSP.1995.479694
Filename :
479694