Regional Information Center for Science and Technology - Large Margin Discriminative Semi-Markov Model for Phonetic Recognition

DocumentCode :

1437037

Title :

Large Margin Discriminative Semi-Markov Model for Phonetic Recognition

Author :

Kim, Sungwoong ; Yun, Sungrack ; Yoo, Chang D.

Author_Institution :

Dept. of Electr. Eng., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea

Volume :

Issue :

Year :

2011

Firstpage :

1999

Lastpage :

2012

Abstract :

This paper considers a large margin discriminative semi-Markov model (LMSMM) for phonetic recognition. The hidden Markov model (HMM) framework that is often used for phonetic recognition assumes only local statistical dependencies between adjacent observations, and it predicts a label for each observation without explicit phone segmentation. The semi-Markov model (SMM) framework, by contrast, allows simultaneous segmentation and labeling of sequential data based on a segment-based Markovian structure that assumes statistical dependencies among all the observations within a phone segment. For phonetic recognition, which is inherently a joint segmentation and labeling problem, the SMM framework has the potential to perform better than the HMM framework at the expense of a slight increase in computational complexity. The SMM framework considered in this paper is based on a non-probabilistic discriminant function that is linear in the joint feature map, which attempts to capture long-range statistical dependencies among observations. The parameters of the discriminant function are estimated by a large margin learning framework for structured prediction. The parameter estimation problem at hand leads to an optimization problem with many margin constraints, and this constrained optimization problem is solved using a stochastic gradient descent algorithm. The proposed LMSMM outperformed the large margin discriminative HMM on the TIMIT phonetic recognition task.

Keywords :

computational complexity; gradient methods; hidden Markov models; parameter estimation; speech recognition; stochastic programming; TIMIT phonetic recognition task; constrained optimization problem; explicit phone segmentation; hidden Markov model; joint feature map; large margin discriminative semi-Markov model; local statistical dependency; non-probabilistic discriminant function; parameter estimation problem; sequential data labeling; stochastic gradient descent algorithm; Acoustics; Joints; Labeling; Probability; Speech; Automatic speech recognition (ASR); large margin discriminative models; semi-Markov models; structured support vector machines

Language :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2011.2108286

Filename :

5703116

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1437037