DocumentCode :
1379487
Title :
Bayesian Networks for Discrete Observation Distributions in Speech Recognition
Author :
Miguel, Antonio ; Ortega, Alfonso ; Buera, Luis ; Lleida, Eduardo
Author_Institution :
Aragon Inst. of Eng. Res. (I3A), Univ. of Zaragoza, Zaragoza, Spain
Volume :
19
Issue :
6
fYear :
2011
Firstpage :
1476
Lastpage :
1489
Abstract :
In speech recognition, the state emission probability distributions of hidden Markov models are traditionally defined over continuous random variables and modeled with Gaussian mixtures. A single Gaussian is unimodal and cannot accurately model complex multimodal inter-feature dependencies, so mixtures of Gaussians are needed in these cases, but they capture such dependencies only in a loose and inefficient way. Graphical models provide a precise and simple mechanism to model the dependencies among two or more variables. This paper proposes using discrete random variables as observations, together with graphical models that extract the internal dependence structure of the feature vectors. To obtain a tractable model, the speech features are quantized to a small number of levels. The quantized speech features also provide a mechanism to increase robustness against noise uncertainty. In addition, discrete random variables allow the joint statistics of the observation distributions to be learned. The paper presents a method to estimate a graphical model with a constrained number of dependencies, which is a special kind of Bayesian network. Experimental results show that this modeling yields better performance than standard baseline systems.
Keywords :
Gaussian processes; belief networks; hidden Markov models; speech recognition; statistical distributions; Bayesian network; Gaussian mixture; Gaussian model; complex multimodal interfeature dependency; continuous random variable; discrete observation distribution; discrete random variable; graphical model; hidden Markov model state emission probability distribution; quantized speech feature; speech recognition; standard baseline system; unimodal distribution; Approximation methods; Bayesian methods; Covariance matrix; Graphical models; Hidden Markov models; Joints; Speech recognition; Bayesian networks; expectation–maximization; graphical models; maximum likelihood;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2010.2092764
Filename :
5638127