Title :
Speaker-dependent 1000 word recognition using a large scale neural network 'CombNET-II' and dynamic spectral features
Author :
Kitamura, Tadashi ; Hui, Wei ; Iwata, Akira ; Suzumura, Nobuo
Author_Institution :
Dept. of Electr. & Comput. Eng., Nagoya Inst. of Technol., Japan
Abstract :
The authors describe speaker-dependent large-vocabulary word recognition using a large-scale neural network, CombNET-II, which consists of a four-layered neural network with a comb structure, and dynamic spectral features of speech based on a two-dimensional mel-cepstrum. CombNET-II consists of two types of neural networks. The first part is a stem network, which learns by a self-growing algorithm and roughly classifies an input pattern. The second part consists of many branch networks, which learn by a backpropagation algorithm and precisely classify the input pattern. The stem network is a vector quantizing network that reduces the number of category candidates for the branch networks, so that each branch network has only a small number of connections and is easy to tune. Experiments on speaker-dependent large-vocabulary word recognition for 1000 Chinese spoken words are described. Experimental results show that a high recognition accuracy of 99.1% is obtained and that CombNET-II is very effective for large-vocabulary spoken word recognition.
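The two-stage classification described in the abstract can be illustrated with a minimal sketch. It shows only the routing idea: a stem stage picks the nearest prototype (vector quantization), and the branch attached to that prototype chooses among its own small set of categories. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the toy two-dimensional inputs, the word labels, and the use of a single linear scoring layer as a stand-in for a branch network. The self-growing training of the stem and the backpropagation training of the branches are not reproduced.

```python
import math


def nearest_stem(x, stem_prototypes):
    """Rough classification: index of the prototype closest to x."""
    return min(
        range(len(stem_prototypes)),
        key=lambda k: math.dist(x, stem_prototypes[k]),
    )


def branch_classify(x, branch):
    """Fine classification within one branch.

    A single linear layer stands in for a trained branch network
    (assumption: the real branches are multilayer networks).
    """
    scores = [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(branch["weights"], branch["bias"])
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return branch["categories"][best]


def two_stage_classify(x, stem_prototypes, branches):
    """Route x through the stem, then classify it with the selected branch."""
    k = nearest_stem(x, stem_prototypes)
    return k, branch_classify(x, branches[k])


# Hypothetical toy model: two stem prototypes, two categories per branch.
stem_prototypes = [(0.0, 0.0), (10.0, 10.0)]
branches = [
    {
        "weights": [(1.0, 0.0), (-1.0, 0.0)],
        "bias": [0.0, 0.0],
        "categories": ["word_a", "word_b"],
    },
    {
        "weights": [(0.0, 1.0), (0.0, -1.0)],
        "bias": [-10.0, 10.0],
        "categories": ["word_c", "word_d"],
    },
]

result = two_stage_classify((1.0, 0.5), stem_prototypes, branches)
```

Because each branch only scores the categories assigned to its stem prototype, the per-branch weight table stays small even when the total vocabulary is large, which is the benefit the abstract attributes to the stem stage.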
Keywords :
learning systems; neural nets; spectral analysis; speech recognition; 2D mel-cepstrum; Chinese; CombNET-II; backpropagation; dynamic spectral features; large-scale neural network; self-growing algorithm; speaker independent speech recognition; Computer networks; Feedforward neural networks; Fourier transforms; Frequency domain analysis; Large-scale systems; Neural networks; Neurons; Speaker recognition; Speech recognition; Vocabulary;
Conference_Title :
Neural Networks, 1991. 1991 IEEE International Joint Conference on
Print_ISBN :
0-7803-0227-3
DOI :
10.1109/IJCNN.1991.170560