Title :
Minimum mean squared error time series classification using an echo state network prediction model
Author :
Skowronski, Mark D. ; Harris, John G.
Author_Institution :
Dept. of Electr. & Comput. Eng., Florida Univ., Gainesville, FL
Abstract :
The echo state network (ESN) has been recently proposed as an alternative recurrent neural network model. An ESN consists of a reservoir of conventional processing elements, which are recurrently interconnected with untrained random weights, and a readout layer, which is trained using linear regression methods. The key advantage of the ESN is the ability to model systems without the need to train the recurrent weights. In this paper, we use an ESN to model the production of speech signals in a classification experiment using isolated utterances of the English digits "zero" through "nine." One prediction model for each digit was trained using frame-based speech features (cepstral coefficients) from all train utterances, and the readout layer consisted of several linear regressors which were trained to target different portions of the time series using a dynamic programming algorithm (Viterbi). Each novel test utterance was classified with the label from the digit model with the minimum mean squared prediction error. Using a corpus of 4130 isolated digits from 8 male and 8 female speakers, the highest classification accuracy attained with an ESN was 100.0% (99.1%) on the train (test) set, compared to 100% (94.7%) for a hidden Markov model (HMM). HMM performance increased to 100.0% (99.8%) when context features (first- and second-order temporal derivatives) were appended to the cepstral coefficients. The ESN offers an attractive alternative to the HMM because of the ESN's simple train procedure, low computational requirements, and inherent ability to model the dynamics of the signal under study.
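The pipeline described in the abstract (a fixed random reservoir, a linear readout trained by regression, and classification by minimum mean squared prediction error) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single ridge-regression readout per class instead of several Viterbi-segmented regressors, synthetic sinusoids in place of cepstral coefficients, and hypothetical names (ESNPredictor, classify, sine) and hyperparameters (reservoir size, spectral radius, ridge term, washout) chosen only for illustration.

```python
import numpy as np


class ESNPredictor:
    """One-step-ahead predictor: fixed random reservoir plus a linear readout.

    Simplification (assumption): one readout per model, no Viterbi segmentation.
    """

    def __init__(self, n_in, n_res=50, spectral_radius=0.9, ridge=1e-4, seed=0):
        rng = np.random.default_rng(seed)
        # Input and recurrent weights are random and never trained.
        self.W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
        W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
        # Rescale so the largest eigenvalue magnitude equals spectral_radius.
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W = W
        self.ridge = ridge
        self.W_out = None

    def _states(self, frames):
        """Drive the reservoir with frames of shape (T, n_in); return (T, n_res + 1)."""
        x = np.zeros(self.W.shape[0])
        states = []
        for u in frames:
            x = np.tanh(self.W_in @ u + self.W @ x)
            states.append(np.concatenate(([1.0], x)))  # constant bias term + state
        return np.array(states)

    def fit(self, sequences, washout=50):
        """Train the readout to predict frame t+1 from the reservoir state at t."""
        X, Y = [], []
        for seq in sequences:
            X.append(self._states(seq[:-1])[washout:])  # drop the initial transient
            Y.append(seq[1:][washout:])
        X, Y = np.vstack(X), np.vstack(Y)
        # Ridge-regularized least squares (normal equations).
        A = X.T @ X + self.ridge * np.eye(X.shape[1])
        self.W_out = np.linalg.solve(A, X.T @ Y)
        return self

    def mse(self, seq, washout=50):
        """Mean squared one-step prediction error of this model on seq."""
        S = self._states(seq[:-1])[washout:]
        err = S @ self.W_out - seq[1:][washout:]
        return float(np.mean(err ** 2))


def classify(models, seq):
    """Return the label whose prediction model has the minimum MSE on seq."""
    return min(models, key=lambda label: models[label].mse(seq))


def sine(freq, phase, n=400):
    """Toy one-dimensional 'feature' sequence of shape (n, 1); stands in for real frames."""
    t = np.arange(n)
    return np.sin(freq * t + phase)[:, None]


# One prediction model per class, each trained only on that class's sequences.
models = {
    "slow": ESNPredictor(n_in=1).fit([sine(0.3, 0.0)]),
    "fast": ESNPredictor(n_in=1).fit([sine(0.9, 0.0)]),
}
label = classify(models, sine(0.3, 1.0))  # unseen phase of the "slow" class
print(label)
```

For speech, each sequence would instead be a (frames x coefficients) array of cepstral features, with n_in set to the number of coefficients and one model fitted per digit.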
Keywords :
cepstral analysis; hidden Markov models; least mean squares methods; recurrent neural nets; signal classification; speech processing; cepstral coefficients; dynamic programming algorithm; echo state network prediction model; frame-based speech features; hidden Markov model; isolated utterances; linear regression methods; minimum mean squared error; readout layer; recurrent neural network model; recurrent weights; speech signals; time series classification; train utterances; untrained random weights; Cepstral analysis; Dynamic programming; Heuristic algorithms; Hidden Markov models; Linear regression; Predictive models; Recurrent neural networks; Reservoirs; Speech; Testing;
Conference_Title :
Circuits and Systems, 2006. ISCAS 2006. Proceedings. 2006 IEEE International Symposium on
Conference_Location :
Island of Kos
Print_ISBN :
0-7803-9389-9
DOI :
10.1109/ISCAS.2006.1693294