DocumentCode
540203
Title
Effect of neural network input span on phoneme classification
Author
Kamm, C.A. ; Singhal, S.
fYear
1990
fDate
17-21 June 1990
Firstpage
195
Abstract
Feedforward neural networks spanning input durations of 35, 65, 125, and 245 ms were trained to classify subword speech segments from continuous utterances into 46 phoneme-like classes. The performance of networks with different input spans showed that brief sounds can be reliably detected by networks with longer input spans, but results for a subset of longer-duration phonemes (diphthongs) indicate that a network needs a wide enough view of the input to capture the salient features of the output class. The best classification performance was observed using a completely connected three-layer network with a 125-ms input span. Performance averaged 56%, ranging from 31% to 82% across phoneme classes, using a top-three candidates decision criterion.
Keywords
neural nets; speech analysis and processing; speech recognition; 35 to 245 ms; continuous utterances; neural network input span; phoneme classification; subword speech segments; three-layer network
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks, 1990., 1990 IJCNN International Joint Conference on
Conference_Location
San Diego, CA, USA
Type
conf
DOI
10.1109/IJCNN.1990.137569
Filename
5726529