Regional Information Center for Science and Technology (RICeST) - Maximum entropy direct models for speech recognition

DocumentCode :

3243987

Title :

Maximum entropy direct models for speech recognition

Author :

Kuo, Hong-Kwang Jeff ; Gao, Yuqing

Author_Institution :

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

fYear :

2003

fDate :

30 Nov.-3 Dec. 2003

Firstpage :

Lastpage :

Abstract :

Traditional statistical models for speech recognition have all been based on a Bayesian framework using generative models such as hidden Markov models (HMMs). The paper focuses on a new framework for speech recognition using maximum entropy direct modeling, where the probability of a state or word sequence given an observation sequence is computed directly from the model. In contrast to HMMs, features can be asynchronous and overlapping. This model therefore allows for the potential combination of many different types of features. A specific kind of direct model, the maximum entropy Markov model (MEMM), is studied. Even with conventional acoustic features, the approach already shows promising results for phone level decoding. The MEMM significantly outperforms traditional HMMs in word error rate when used as stand-alone acoustic models. Preliminary results combining the MEMM scores with HMM and language model scores show modest improvements over the best HMM speech recognizer.
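
As an illustration of the direct modeling idea described in the abstract, the following is a minimal sketch of a generic maximum entropy Markov model, not the authors' implementation. It assumes the commonly used MEMM form, in which each state's probability given the previous state and the current observation is a normalized exponential of weighted feature functions, and the probability of a state sequence is the product of these per-frame conditionals. The function names, the toy states, the two feature functions, and the weights are all hypothetical and chosen only to make the example runnable.

```python
import math


def local_distribution(weights, features, prev_state, obs, states):
    """Return P(s | prev_state, obs) for every s in states.

    Each probability is a normalized exponential of a weighted feature sum.
    """
    scores = {
        s: sum(w * f(prev_state, s, obs) for w, f in zip(weights, features))
        for s in states
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {s: math.exp(v - top) for s, v in scores.items()}
    z = sum(exps.values())
    return {s: e / z for s, e in exps.items()}


def sequence_probability(weights, features, state_seq, obs_seq, states, start="<s>"):
    """Return P(state_seq | obs_seq) as a product of per-frame conditionals."""
    prob, prev = 1.0, start
    for s, o in zip(state_seq, obs_seq):
        prob *= local_distribution(weights, features, prev, o, states)[s]
        prev = s
    return prob


# Hypothetical toy setup: two phone-like states and a scalar observation.
STATES = ["a", "b"]
FEATURES = [
    lambda prev, s, o: 1.0 if prev == s else 0.0,  # self-transition indicator
    lambda prev, s, o: o if s == "a" else -o,      # observation-dependent feature
]
WEIGHTS = [0.5, 1.0]

if __name__ == "__main__":
    print(local_distribution(WEIGHTS, FEATURES, "a", 0.3, STATES))
    print(sequence_probability(WEIGHTS, FEATURES, ["a", "b"], [0.3, -0.2], STATES))
```

Because the model conditions on the observation rather than generating it, the feature functions are free to inspect any aspect of the input, which is what allows the overlapping and asynchronous features mentioned in the abstract. In practice the weights would be estimated from training data rather than set by hand.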

Keywords :

error statistics; hidden Markov models; maximum entropy methods; probability; speech recognition; Bayesian framework; HMM; acoustic features; maximum entropy Markov model; maximum entropy direct models; phone level decoding; statistical models; word error rate; Bayesian methods; Decoding; Entropy; Equations; Error analysis; Natural languages; State-space methods

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2003 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU '03)

Print_ISBN :

0-7803-7980-2

Type :

conf

DOI :

10.1109/ASRU.2003.1318394

Filename :

1318394

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3243987