• DocumentCode
    336802
  • Title

    Training of HMM with filtered speech material for hands-free recognition

  • Author

    Giuliani, D. ; Matassoni, M. ; Omologo, M. ; Svaizer, P.

  • Author_Institution
    ITC-IRST, Centro per la Ricerca Sci. e Technol., Trento, Italy
  • Volume
    1
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    449
  • Abstract
    This paper addresses the problem of hands-free speech recognition in a noisy office environment. An array of six omnidirectional microphones and a corresponding time delay compensation module are used to provide a beamformed signal as input to an HMM-based recognizer. Training of HMMs is performed using either a clean speech database or a filtered version of the same database. Filtering consists of a convolution with the acoustic impulse response between the speaker and the microphone, to reproduce the reverberation effect. Background noise is then added to provide the desired SNR. The paper shows that the new models trained on these data perform better than the baseline ones. Furthermore, the paper investigates maximum likelihood linear regression (MLLR) adaptation of the new models. It is shown that a further performance improvement is obtained, reaching a 98.7% word recognition rate (WRR) in a connected digit recognition task when the talker is at a distance of 1.5 m from the array.
  • Keywords
    acoustic transducer arrays; array signal processing; delays; filtering theory; hidden Markov models; microphones; office environment; reverberation; speech recognition; HMM training; HMM-based recognizer; MLLR adaptation; SNR; acoustic impulse response; background noise; beamformed signal; clean speech database; connected digit recognition task; convolution; distance; filtered speech material; hands-free speech recognition; maximum likelihood linear regression; microphone; noisy office environment; omnidirectional microphone array; performance improvement; reverberation effect; time delay compensation module; word recognition rate; Convolution; Databases; Delay effects; Filtering; Hidden Markov models; Loudspeakers; Maximum likelihood linear regression; Microphone arrays; Speech recognition; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.1999.758159
  • Filename
    758159
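
The filtering step described in the abstract (convolution of clean speech with an acoustic impulse response, followed by the addition of background noise at a chosen SNR) could be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function name `contaminate`, its parameters, the truncation of the reverberant tail, and the definition of SNR as a ratio of mean powers over the whole utterance are all assumptions introduced here.

```python
import numpy as np


def contaminate(clean, impulse_response, noise, snr_db):
    """Hypothetical helper: add reverberation and noise to clean speech.

    Assumptions (not taken from the paper): signals are 1-D arrays at the
    same sampling rate, and SNR is the ratio of mean powers measured over
    the whole utterance.
    """
    clean = np.asarray(clean, dtype=float)
    h = np.asarray(impulse_response, dtype=float)
    noise = np.asarray(noise, dtype=float)

    # Reverberation: linear convolution with the impulse response,
    # truncated to the original utterance length (an assumed choice).
    reverberant = np.convolve(clean, h)[: len(clean)]

    if len(noise) < len(reverberant):
        raise ValueError("noise must be at least as long as the speech")
    noise = noise[: len(reverberant)]

    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("speech and noise must have non-zero power")

    # Scale the noise so that speech_power / scaled_noise_power
    # equals 10 ** (snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return reverberant + gain * noise
```

With a measured impulse response and a recording of office noise, a call such as `contaminate(clean, h, noise, snr_db=15.0)` would yield one filtered training utterance. The 15 dB value is purely illustrative, since the record does not state the SNR used.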