• DocumentCode
    540203
  • Title

    Effect of neural network input span on phoneme classification

  • Author

    Kamm, C.A. ; Singhal, S.

  • fYear
    1990
  • fDate
    17-21 June 1990
  • Firstpage
    195
  • Abstract
    Feedforward neural networks spanning input durations of 35, 65, 125, and 245 ms were trained to classify subword speech segments from continuous utterances into 46 phoneme-like classes. The performance of networks with different input spans showed that brief sounds can be reliably detected by networks with longer input spans, but results for a subset of longer-duration phonemes (diphthongs) indicate that a network needs a wide enough view of the input to capture the salient features of the output class. The best classification performance was observed using a completely connected three-layer network with a 125-ms input span. Performance averaged 56%, ranging from 31% to 82% across phoneme classes, using a top-three candidates decision criterion.
  • Keywords
    neural nets; speech analysis and processing; speech recognition; 35 to 245 ms; continuous utterances; neural network input span; phoneme classification; subword speech segments; three-layer network;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 1990., 1990 IJCNN International Joint Conference on
  • Conference_Location
    San Diego, CA, USA
  • Type

    conf

  • DOI
    10.1109/IJCNN.1990.137569
  • Filename
    5726529