• DocumentCode
    394204
  • Title

    Generalized tandem feature extraction

  • Author

    Sivadas, Sunil ; Hermansky, Hynek

  • Author_Institution
    Oregon Graduate Inst. of Sci. & Technol., Portland, OR, USA
  • Volume
    1
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    We study the use of a generalized multilayer perceptron (MLP) architecture for tandem feature extraction. In the tandem feature extraction scheme, an MLP with a softmax output layer is discriminatively trained to estimate phoneme posterior probabilities on a labeled database. The outputs of the MLP, after nonlinear transformation and whitening, are used as features in a Gaussian mixture model (GMM) based speech recognizer. We consider three-layer MLPs with a linear output layer. They nonlinearly transform the input data to a higher-dimensional space defined by the outputs of the hidden units and perform linear discriminant analysis (LDA) on the hidden unit outputs. We compare the performance of these features with the direct application of LDA to the input data, which is equivalent to an MLP with linear hidden and output layers. The tandem features outperform those obtained from LDA and linear-output MLPs on a connected digit recognition task.
  • Keywords
    feature extraction; hidden Markov models; learning (artificial intelligence); multilayer perceptrons; parameter estimation; speech recognition; Gaussian mixture model; HMM classifier; connected digit recognition task; generalized multilayer perceptron; linear discriminant analysis; nonlinear transform; phoneme posterior probability estimation; softmax output layer; speech recognition tasks; tandem feature extraction; Cepstral analysis; Computer science; Covariance matrix; Feature extraction; Functional analysis; Hidden Markov models; Linear discriminant analysis; Principal component analysis; Vectors; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.2003.1198715
  • Filename
    1198715
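
The tandem pipeline summarized in the Abstract (MLP posterior estimates, a nonlinear transformation, then whitening) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not name the nonlinearity or the whitening method, so the logarithm and a Cholesky-based whitening transform are assumed here; the weights are random rather than discriminatively trained, the input frames are synthetic, and every function name (`mlp_posteriors`, `tandem_features`, and so on) is hypothetical. Only the Python standard library is used.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of activations."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def affine(W, b, x):
    """Return W x + b, where W is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]


def mlp_posteriors(x, W1, b1, W2, b2):
    """Three-layer MLP: input -> tanh hidden layer -> softmax output."""
    hidden = [math.tanh(a) for a in affine(W1, b1, x)]
    return softmax(affine(W2, b2, hidden))


def covariance(rows, mean):
    """Unbiased sample covariance matrix of the given row vectors."""
    n, d = len(rows), len(mean)
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]


def cholesky(C):
    """Lower-triangular L with L L^T = C (C must be positive definite)."""
    d = len(C)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = C[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def forward_solve(L, v):
    """Solve L y = v for lower-triangular L by forward substitution."""
    y = []
    for i, row in enumerate(L):
        y.append((v[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def tandem_features(frames, W1, b1, W2, b2):
    """Posteriors -> log (assumed nonlinearity) -> whitening (assumed Cholesky)."""
    # Floor the posteriors so the logarithm is always defined.
    log_post = [[math.log(max(p, 1e-12))
                 for p in mlp_posteriors(x, W1, b1, W2, b2)] for x in frames]
    d = len(log_post[0])
    mean = [sum(r[i] for r in log_post) / len(log_post) for i in range(d)]
    L = cholesky(covariance(log_post, mean))
    # y = L^{-1} (x - mean) has identity sample covariance.
    return [forward_solve(L, [r[i] - mean[i] for i in range(d)])
            for r in log_post]


# Toy usage: random (untrained) weights and synthetic input frames.
rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


W1, b1 = rand_matrix(5, 4), [0.0] * 5   # 4 inputs -> 5 hidden units
W2, b2 = rand_matrix(3, 5), [0.0] * 3   # 5 hidden units -> 3 classes
frames = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
feats = tandem_features(frames, W1, b1, W2, b2)
print(len(feats), len(feats[0]))  # 200 3
```

In a real system the whitened vectors would then be passed to the GMM-based recognizer as its feature stream. An eigenvector-based (PCA) whitening step could replace the Cholesky factorization and would also allow the dimensionality to be reduced.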