• DocumentCode
    1443840
  • Title
    Analysis of MLP-Based Hierarchical Phoneme Posterior Probability Estimator
  • Author
    Pinto, Joel ; Garimella, Sivaram ; Magimai-Doss, Mathew ; Hermansky, Hynek ; Bourlard, Hervé
  • Author_Institution
    Idiap Res. Inst., Martigny, Switzerland
  • Volume
    19
  • Issue
    2
  • fYear
    2011
  • Firstpage
    225
  • Lastpage
    241
  • Abstract
    We analyze a simple hierarchical architecture consisting of two multilayer perceptron (MLP) classifiers in tandem to estimate the phonetic class conditional probabilities. In this hierarchical setup, the first MLP classifier is trained using standard acoustic features. The second MLP is trained using the posterior probabilities of phonemes estimated by the first, but with a long temporal context of around 150-230 ms. Through extensive phoneme recognition experiments, and the analysis of the trained second MLP using Volterra series, we show that 1) the hierarchical system yields higher phoneme recognition accuracies (an absolute improvement of 3.5% and 9.3% on TIMIT and CTS, respectively) over the conventional single MLP-based system, 2) there exists useful information in the temporal trajectories of the posterior feature space, spanning around 230 ms of context, 3) the second MLP learns the phonetic temporal patterns in the posterior features, which include the phonetic confusions at the output of the first MLP as well as the phonotactics of the language as observed in the training data, and 4) the second MLP classifier requires fewer parameters and can be trained using less training data.
  • Keywords
    Volterra series; multilayer perceptrons; probability; speech processing; MLP classifier; hierarchical architecture; hierarchical phoneme posterior probability estimator; multilayer perceptron classifiers; phonetic class conditional probabilities; phonotactics; Art; Automatic speech recognition; Hidden Markov models; Hierarchical systems; Iron; Mel frequency cepstral coefficient; Parametric statistics; Training data; posterior probabilities
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2010.2045943
  • Filename
    5432979
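
The abstract states that the second MLP is trained on the first MLP's phoneme posteriors taken over a long temporal context of around 150-230 ms. The following is a minimal sketch of how such an input could be assembled, not the authors' implementation. The function names `stack_context` and `context_frames`, the 10 ms frame shift, and the edge padding by frame repetition are all assumptions made for illustration; the record does not specify them.

```python
from typing import List

Frame = List[float]  # one posterior vector (one value per phoneme class)


def context_frames(context_ms: float, frame_shift_ms: float = 10.0) -> int:
    """Number of frames spanning context_ms, forced odd so the window is centred.

    The 10 ms default frame shift is an assumption, not taken from the record.
    """
    frames = int(round(context_ms / frame_shift_ms))
    return frames if frames % 2 == 1 else frames + 1


def stack_context(posteriors: List[Frame], half_width: int) -> List[Frame]:
    """Concatenate each frame with its half_width neighbours on either side.

    Each output vector holds (2 * half_width + 1) posterior vectors laid end
    to end. Edges are padded by repeating the first/last frame (an assumption).
    """
    if not posteriors:
        return []
    n = len(posteriors)
    stacked: List[Frame] = []
    for t in range(n):
        window: Frame = []
        for offset in range(-half_width, half_width + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp index to valid range
            window.extend(posteriors[idx])
        stacked.append(window)
    return stacked


if __name__ == "__main__":
    # Toy posteriors for three frames over two phoneme classes.
    first_mlp_output = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
    second_mlp_input = stack_context(first_mlp_output, half_width=1)
    print(len(second_mlp_input), len(second_mlp_input[0]))
```

Under the assumed 10 ms frame shift, `context_frames(230)` gives a 23-frame window, so `half_width` would be 11; the stacked vectors would then serve as the input features of the second classifier.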