DocumentCode :
1437037
Title :
Large Margin Discriminative Semi-Markov Model for Phonetic Recognition
Author :
Kim, Sungwoong ; Yun, Sungrack ; Yoo, Chang D.
Author_Institution :
Dept. of Electr. Eng., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea
Volume :
19
Issue :
7
Year :
2011
Firstpage :
1999
Lastpage :
2012
Abstract :
This paper considers a large margin discriminative semi-Markov model (LMSMM) for phonetic recognition. The hidden Markov model (HMM) framework that is often used for phonetic recognition assumes only local statistical dependencies between adjacent observations, and it predicts a label for each observation without explicit phone segmentation. The semi-Markov model (SMM) framework, in contrast, allows simultaneous segmentation and labeling of sequential data based on a segment-based Markovian structure that assumes statistical dependencies among all the observations within a phone segment. For phonetic recognition, which is inherently a joint segmentation and labeling problem, the SMM framework has the potential to perform better than the HMM framework at the expense of a slight increase in computational complexity. The SMM framework considered in this paper is based on a non-probabilistic discriminant function that is linear in the joint feature map, which attempts to capture long-range statistical dependencies among observations. The parameters of the discriminant function are estimated by a large margin learning framework for structured prediction. The parameter estimation problem at hand leads to an optimization problem with many margin constraints, and this constrained optimization problem is solved using a stochastic gradient descent algorithm. The proposed LMSMM outperformed the large margin discriminative HMM on the TIMIT phonetic recognition task.
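The joint segmentation and labeling described in the abstract can be illustrated with a generic semi-Markov Viterbi decoder. The sketch below is not the paper's implementation: the function names, the toy scoring function, the per-segment penalty, and the two labels are all assumptions made for illustration. It shows only how a score that depends on every frame inside a candidate segment can be maximized over all segmentations by dynamic programming.

```python
from typing import Callable, List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)


def semi_markov_decode(
    n_frames: int,
    labels: List[str],
    segment_score: Callable[[int, int, str], float],
    transition_score: Callable[[str, str], float],
    max_len: int,
) -> Tuple[float, List[Segment]]:
    """Return the best total score and segmentation of frames [0, n_frames)."""
    neg_inf = float("-inf")
    # best[t][y]: best score over segmentations of [0, t) whose last label is y.
    best = [{y: neg_inf for y in labels} for _ in range(n_frames + 1)]
    # back[t][y]: (start of last segment, label of previous segment or None).
    back: List[dict] = [{y: None for y in labels} for _ in range(n_frames + 1)]

    for t in range(1, n_frames + 1):
        for y in labels:
            for d in range(1, min(max_len, t) + 1):
                s = t - d
                seg = segment_score(s, t, y)  # may look at all frames in [s, t)
                if s == 0:
                    candidates: List[Tuple[float, Optional[str]]] = [(seg, None)]
                else:
                    candidates = [
                        (best[s][p] + transition_score(p, y) + seg, p)
                        for p in labels
                    ]
                for cand, p in candidates:
                    if cand > best[t][y]:
                        best[t][y] = cand
                        back[t][y] = (s, p)

    # Choose the best final label, then follow the back-pointers.
    y = max(labels, key=lambda lab: best[n_frames][lab])
    score = best[n_frames][y]
    segments: List[Segment] = []
    t = n_frames
    while t > 0:
        s, p = back[t][y]
        segments.append((s, t, y))
        t, y = s, p
    segments.reverse()
    return score, segments


# Toy data (hypothetical): one scalar per frame and one prototype per label.
frames = [0.0, 0.1, 1.0, 0.9, 1.0]
prototypes = {"a": 0.0, "b": 1.0}


def toy_segment_score(s: int, t: int, y: str) -> float:
    # Segment-level score: squared error over the whole segment plus a
    # fixed penalty per segment (an assumed stand-in for a learned score).
    return -sum((x - prototypes[y]) ** 2 for x in frames[s:t]) - 0.5


def toy_transition_score(prev: str, cur: str) -> float:
    return 0.0  # assumed: no transition preference in this toy example


score, segments = semi_markov_decode(
    len(frames), ["a", "b"], toy_segment_score, toy_transition_score, max_len=5
)
print(segments)
```

With a maximum segment length D, L labels, and T frames, this recursion does on the order of T·D·L² score combinations, compared with T·L² for frame-level Viterbi decoding, which is one way to read the abstract's remark about added computational cost.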
Keywords :
computational complexity; gradient methods; hidden Markov models; parameter estimation; speech recognition; stochastic programming; TIMIT phonetic recognition task; constrained optimization problem; explicit phone segmentation; hidden Markov model; joint feature map; large margin discriminative semi-Markov model; local statistical dependency; nonprobabilistic discriminant function; parameter estimation problem; sequential data labeling; stochastic gradient descent algorithm; Acoustics; Hidden Markov models; Joints; Labeling; Probability; Speech; Speech recognition; Automatic speech recognition (ASR); large margin discriminative models; semi-Markov models; structured support vector machines
Language :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2011.2108286
Filename :
5703116