• DocumentCode
    454589
  • Title

    Towards ASR Based on Hierarchical Posterior-Based Keyword Recognition

  • Author

    Fousek, Petr ; Hermansky, Hynek

  • Author_Institution
    IDIAP Res. Inst., Martigny
  • Volume
    1
  • fYear
    2006
  • fDate
    14-19 May 2006
  • Abstract
    The paper presents an alternative approach to automatic speech recognition in which each targeted word is classified by a separate binary classifier against all other sounds. No time alignment is done. To build a recognizer for N words, N parallel binary classifiers are applied. The system first estimates uniformly sampled posterior probabilities of phoneme classes. In a second step, a rather long sliding time window is applied to the phoneme posterior estimates, and its content is classified by an artificial neural network to yield the posterior probability of the keyword. On a small-vocabulary ASR task, the system does not yet reach the performance of a state-of-the-art system, but its conceptual simplicity, the ease of adding new target words, and its inherent resistance to out-of-vocabulary sounds may prove a significant advantage in many applications.
  • Keywords
    neural nets; speech recognition; ASR; artificial neural network; automatic speech recognition; hierarchical posterior-based keyword recognition; out-of-vocabulary sounds; parallel binary classifiers; phoneme classes; phoneme posterior estimates; sliding time window; uniformly sampled posterior probabilities; Artificial neural networks; Automatic speech recognition; Humans; Natural languages; Oral communication; Spectral analysis; Speech processing; Target recognition; Vocabulary; Yield estimation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
  • Conference_Location
    Toulouse
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0469-X
  • Type

    conf

  • DOI
    10.1109/ICASSP.2006.1660050
  • Filename
    1660050
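
The two-step scheme summarized in the abstract (frame-level phoneme posteriors, then a long sliding window classified once per keyword) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `sliding_windows` and `keyword_posteriors` are hypothetical, a single logistic layer stands in for the artificial neural network, the weights are random rather than trained, and every size (frames, phoneme classes, window length, keyword count) is a placeholder, not a value from the paper.

```python
import numpy as np


def sliding_windows(posteriors, window_len):
    """Flatten every run of `window_len` consecutive frames into one vector.

    posteriors: array of shape (n_frames, n_phonemes).
    Returns an array of shape (n_frames - window_len + 1, window_len * n_phonemes).
    """
    n_frames = posteriors.shape[0]
    n_windows = n_frames - window_len + 1
    if n_windows < 1:
        raise ValueError("window_len exceeds the number of frames")
    return np.stack(
        [posteriors[t:t + window_len].ravel() for t in range(n_windows)]
    )


def keyword_posteriors(posteriors, weights, biases, window_len):
    """Score each window with N independent binary classifiers.

    weights: (n_keywords, window_len * n_phonemes); biases: (n_keywords,).
    ASSUMPTION: one logistic layer per keyword replaces the ANN of the abstract.
    """
    windows = sliding_windows(posteriors, window_len)
    logits = windows @ weights.T + biases
    # A separate sigmoid per keyword: each output is "keyword vs. everything
    # else", so the outputs need not sum to one across keywords.
    return 1.0 / (1.0 + np.exp(-logits))


# Placeholder sizes, chosen only to make the sketch runnable.
rng = np.random.default_rng(0)
n_frames, n_phonemes, window_len, n_keywords = 200, 10, 20, 3

# Step 1 stand-in: random frame-level phoneme posteriors (softmax per frame).
frame_logits = rng.normal(size=(n_frames, n_phonemes))
phoneme_post = np.exp(frame_logits)
phoneme_post /= phoneme_post.sum(axis=1, keepdims=True)

# Step 2: untrained per-keyword classifiers applied to each window.
weights = rng.normal(scale=0.01, size=(n_keywords, window_len * n_phonemes))
biases = np.zeros(n_keywords)
kw_post = keyword_posteriors(phoneme_post, weights, biases, window_len)
print(kw_post.shape)
```

Because each keyword has its own row of `weights`, adding a new target word in this sketch means appending one row and one bias, leaving the other classifiers untouched.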