Title :
Multi-source neural networks for speech recognition
Author :
Gemello, Roberto ; Albesano, Dario ; Mana, Franco
Author_Institution :
CSELT, Torino, Italy
Abstract :
In speech recognition, the most widespread technology (hidden Markov models) is constrained by the assumption that its input features are stochastically independent. This limits the simultaneous use of features derived from the speech signal by different processing algorithms. In contrast, artificial neural networks (ANNs) can incorporate multiple heterogeneous input features, which need not be treated as independent, and find the optimal combination of these features for classification. The purpose of this work is to exploit this property of ANNs to improve speech recognition accuracy through the combined use of input features from different sources (different feature extraction algorithms). We integrate two input sources: the Mel-based cepstral coefficients (MFCC) derived from the FFT and the RASTA-PLP cepstral coefficients. The results show that this integration leads to an error reduction of 26% on a telephone-quality test set.
Keywords :
feature extraction; hidden Markov models; multilayer perceptrons; neural net architecture; probability; speech recognition; state estimation; Mel based cepstral coefficients; RASTA-PLP cepstral coefficients; heterogeneous input features; multi-source neural networks; telephone quality test set; Artificial neural networks; Cepstral analysis; Mel frequency cepstral coefficient; Neural networks; Signal processing; Speech processing; Stochastic processes
Conference_Title :
Neural Networks, 1999. IJCNN '99. International Joint Conference on
Conference_Location :
Washington, DC
Print_ISBN :
0-7803-5529-6
DOI :
10.1109/IJCNN.1999.835942