Regional Information Center for Science and Technology (RICEST) - Bayesian Networks for Discrete Observation Distributions in Speech Recognition

DocumentCode :

1379487

Title :

Bayesian Networks for Discrete Observation Distributions in Speech Recognition

Author :

Miguel, Antonio ; Ortega, Alfonso ; Buera, Luis ; Lleida, Eduardo

Author_Institution :

Aragon Inst. of Eng. Res. (I3A), Univ. of Zaragoza, Zaragoza, Spain

Volume :

Issue :

fYear :

2011

Firstpage :

1476

Lastpage :

1489

Abstract :

In speech recognition, the hidden Markov model state emission probability distributions are traditionally associated with continuous random variables and modeled with Gaussian mixtures. Complex multimodal inter-feature dependencies are not accurately modeled by single Gaussians, since these are unimodal distributions; mixtures of Gaussians are needed in these complex cases, but they capture the dependencies in a loose and inefficient way. Graphical models provide a precise and simple mechanism to model the dependencies among two or more variables. This paper proposes the use of discrete random variables as observations and graphical models to extract the internal dependence structure in the feature vectors. To obtain a tractable model, speech features are quantized to a small number of levels. These quantized speech features provide a mechanism to increase the robustness against noise uncertainty. In addition, discrete random variables allow the learning of joint statistics of the observation densities. A method to estimate a graphical model with a constrained number of dependencies, a special kind of Bayesian network, is shown in this paper. Experimental results show that this modeling can achieve better performance than standard baseline systems.
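
As an illustration only, the two ideas named in the abstract (quantizing features to a few levels, and letting a discrete feature depend on a limited number of other features) can be sketched as follows. This is not the authors' estimation method: the bin edges, the toy frames, the choice of a single parent per feature, the add-alpha smoothing, and the function names `quantize`, `fit_marginal`, `fit_conditional`, and `log_prob` are all assumptions made for the example.

```python
import math
from collections import Counter


def quantize(value, edges):
    """Map a continuous value to a discrete level using sorted bin edges."""
    level = 0
    for edge in edges:
        if value >= edge:
            level += 1
    return level


def fit_marginal(samples, n_levels, alpha=1.0):
    """Estimate P(x) from discrete samples with add-alpha smoothing."""
    counts = Counter(samples)
    total = len(samples) + alpha * n_levels
    return {x: (counts[x] + alpha) / total for x in range(n_levels)}


def fit_conditional(child, parent, n_levels, alpha=1.0):
    """Estimate P(child | parent) from paired samples with add-alpha smoothing."""
    pair_counts = Counter(zip(parent, child))
    parent_counts = Counter(parent)
    return {
        (p, c): (pair_counts[(p, c)] + alpha) / (parent_counts[p] + alpha * n_levels)
        for p in range(n_levels)
        for c in range(n_levels)
    }


def log_prob(x0, x1, marginal, conditional):
    """Log-probability of a two-feature frame under the network x0 -> x1."""
    return math.log(marginal[x0]) + math.log(conditional[(x0, x1)])


# Hypothetical two-dimensional feature frames (made-up numbers).
frames = [(-1.2, -0.9), (-0.3, -0.4), (0.1, 0.2), (0.8, 1.1), (1.5, 1.3), (-0.8, -1.0)]
edges = [-0.5, 0.5]          # two edges give three quantization levels
n_levels = len(edges) + 1

quantized = [(quantize(a, edges), quantize(b, edges)) for a, b in frames]
x0 = [a for a, _ in quantized]
x1 = [b for _, b in quantized]

marginal = fit_marginal(x0, n_levels)
conditional = fit_conditional(x1, x0, n_levels)
score = log_prob(2, 2, marginal, conditional)
```

With one parent per feature, each conditional table has only `n_levels * n_levels` entries, which is why a small number of quantization levels keeps such a model tractable.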

Keywords :

Gaussian processes; belief networks; hidden Markov models; speech recognition; statistical distributions; Bayesian network; Gaussian mixture; Gaussian model; complex multimodal interfeature dependency; continuous random variable; discrete observation distribution; discrete random variable; graphical model; hidden Markov model state emission probability distribution; quantized speech feature; speech recognition; standard baseline system; unimodal distribution; Approximation methods; Bayesian methods; Covariance matrix; Graphical models; Hidden Markov models; Joints; Speech recognition; Bayesian networks; expectation–maximization; graphical models; maximum likelihood;

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2010.2092764

Filename :

5638127

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1379487