• DocumentCode
    323483
  • Title

    A novel feature-extraction for speech recognition based on multiple acoustic-feature planes

  • Author

    Nitta, Tsuneo

  • Author_Institution
    Multimedia Eng. Lab., Toshiba Corp., Kawasaki, Japan
  • Volume
    1
  • fYear
    1998
  • fDate
    12-15 May 1998
  • Firstpage
    29
  • Abstract
    This paper describes an attempt to incorporate the functions of the auditory nerve system into the feature extractor of a speech recognizer. The functions cover four types of well-known responses to sound stimuli: local peaks of the steady sound spectrum, ascending FM sound, descending FM sound, and sharply rising and falling sound. Each function is realized as a three-level derivative operator and applied to a time-spectrum (TS) pattern X(t,f) taken from the output of a 26-channel band-pass filter (BPF). The resulting acoustic cues of the input speech, represented as multiple acoustic-feature planes (MAFP), are compressed with the Karhunen-Loeve transform (KLT) and then classified. In experiments on a Japanese E-set (12 consonantal parts of /Ci/) extracted from continuous speech, the MAFP reduced the error rate for unknown speakers to 17.0%, from 34.5% with X(t,f) and 29.6% with X(t,f)+ΔtX(t,f) (dimension=64).
  • Keywords
    acoustic signal processing; feature extraction; frequency modulation; hearing; pattern classification; spectral analysis; speech processing; speech recognition; transforms; Japanese E-set; KLT; acoustic cue; ascending FM sound; auditory nerve system; continuous speech; descending FM sound; error rate; experiments; feature-extraction; input speech; local peaks; multiple acoustic-feature planes; sharply falling sound; sharply rising sound; sound stimuli; speech recognition; steady sound; three-level derivative operator; time-spectrum pattern; Acoustical engineering; Band pass filters; Electronic mail; Error analysis; Feature extraction; Karhunen-Loeve transforms; Laboratories; Multimedia systems; Radio frequency; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
  • Conference_Location
    Seattle, WA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-4428-6
  • Type

    conf

  • DOI
    10.1109/ICASSP.1998.674359
  • Filename
    674359
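
The feature-plane step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the record does not give the paper's operator coefficients, so the 3x3 kernels, the names `KERNELS`, `apply_kernel`, and `feature_planes`, and the use of valid-mode cross-correlation are all hypothetical. The KLT compression and the classifier are omitted.

```python
# Hypothetical 3x3 kernels (rows = time frames, columns = frequency channels).
# The coefficients are illustrative and are NOT taken from the paper.
# Each kernel sums to zero, so a flat time-spectrum pattern gives no response.
KERNELS = {
    # ridge that stays in one channel over time (steady spectral peak)
    "peak": [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]],
    # ridge whose channel index increases with time (ascending FM)
    "ascending_fm": [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]],
    # ridge whose channel index decreases with time (descending FM)
    "descending_fm": [[-1, -1, 2], [-1, 2, -1], [2, -1, -1]],
    # difference along time across all channels (sharp rise or fall)
    "time_edge": [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
}


def apply_kernel(pattern, kernel):
    """Valid-mode 2-D cross-correlation of a time x frequency pattern with a kernel."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    n_frames, n_channels = len(pattern), len(pattern[0])
    plane = []
    for t in range(n_frames - k_rows + 1):
        row = []
        for f in range(n_channels - k_cols + 1):
            row.append(sum(kernel[i][j] * pattern[t + i][f + j]
                           for i in range(k_rows)
                           for j in range(k_cols)))
        plane.append(row)
    return plane


def feature_planes(pattern):
    """Return one feature plane per kernel for a pattern X(t, f) given as a list of rows."""
    return {name: apply_kernel(pattern, kernel) for name, kernel in KERNELS.items()}


if __name__ == "__main__":
    # Toy input: 5 frames x 26 channels with a single upward sweep.
    sweep = [[1.0 if f == t + 10 else 0.0 for f in range(26)] for t in range(5)]
    for name, plane in feature_planes(sweep).items():
        print(name, max(max(row) for row in plane))
```

In a fuller pipeline the planes would be flattened into one vector per speech segment and projected onto the leading eigenvectors of the training covariance matrix, which is the KLT step the abstract mentions.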