• DocumentCode
    2403697
  • Title
    Robustness of a chaotic modal neural network applied to audio-visual speech recognition
  • Author
    Kabré, Harouna
  • Author_Institution
    Univ. Joseph Fourier, Grenoble, France
  • fYear
    1997
  • fDate
    24-26 Sep 1997
  • Firstpage
    607
  • Lastpage
    616
  • Abstract
    We stabilized a chaotic modal neural network (MNN) for robust speech recognition. A modal neural network is an artificial neural network system with two levels of information processing. The first level is trained to store and retrieve acoustic and visual patterns. The states of this network, which represent the sound classes in a speech recognition task, are called modes and are assumed to evolve chaotically when speech recognition is performed in adverse environments. The second level controls the chaotic behavior of the modes. An external signal taken from a visual input, such as the speaker's lip-opening parameters, is applied to stabilize an acoustic modal network, whose modes are moved from an initial position to a target position. The task addressed is audio-visual recognition of the 10 French vowels corrupted by noise. Perceptual linear predictive analysis of the speech signal for the 10 vowels yields vectors of 5 spectral parameters, which are in turn fed into a modal neural network implemented as a feedforward network. When the noise level increases, the classes stored by the acoustic MNN exhibit chaotic behavior, which is stabilized by the signal from the visual path. We show that in an uncooperative environment, a chaotic modal neural network stabilizes well
  • Keywords
    acoustic signal processing; chaos; feedforward neural nets; image recognition; speech recognition; French vowels; acoustic modal network; acoustic patterns; audio-visual speech recognition; chaotic behavior; chaotic modal neural network; feedforward network; information processing; lip-opening; perceptual linear predictive analysis; robust speech recognition; speech signal; uncooperative environment; visual patterns; Acoustic noise; Artificial neural networks; Chaos; Information processing; Loudspeakers; Multi-layer neural network; Neural networks; Robustness; Speech analysis; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks for Signal Processing [1997] VII. Proceedings of the 1997 IEEE Workshop
  • Conference_Location
    Amelia Island, FL
  • ISSN
    1089-3555
  • Print_ISBN
    0-7803-4256-9
  • Type
    conf
  • DOI
    10.1109/NNSP.1997.622443
  • Filename
    622443