Title :
Deep neural support vector machines for speech recognition
Author :
Shi-Xiong Zhang ; Chaojun Liu ; Kaisheng Yao ; Yifan Gong
Author_Institution :
Microsoft Corp., Redmond, WA, USA
Abstract :
A new type of deep neural network (DNN) is presented in this paper. Traditional DNNs use multinomial logistic regression (a softmax activation) at the top layer for classification. The new DNN instead uses a support vector machine (SVM) at the top layer. Two training algorithms are proposed, at the frame level and the sequence level, to learn the parameters of the SVM and the DNN under the maximum-margin criterion. In frame-level training, the new model is shown to be related to the multiclass SVM with DNN features; in sequence-level training, it is related to the structured SVM with DNN features and HMM state transition features. Its decoding process is similar to that of the DNN-HMM hybrid system, but with the frame-level posterior probabilities replaced by scores from the SVM. We term the new model the deep neural support vector machine (DNSVM). We have verified its effectiveness on the TIMIT task for continuous speech recognition.
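The frame-level criterion described above can be illustrated with a short sketch: a Crammer-Singer multiclass hinge loss applied to fixed DNN hidden activations, in place of the usual softmax cross-entropy. The function name, toy features, and weights below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def multiclass_hinge_loss(features, labels, W, margin=1.0):
    """Crammer-Singer multiclass hinge loss on DNN features (illustrative).

    features: (N, D) last-hidden-layer activations of the DNN
    labels:   (N,)  integer frame labels (e.g., HMM states)
    W:        (D, C) weights of the SVM top layer
    """
    scores = features @ W                                   # (N, C) class scores
    correct = scores[np.arange(len(labels)), labels]        # score of the true class
    # hinge: penalize competing classes within the margin of the true class
    margins = np.maximum(0.0, scores - correct[:, None] + margin)
    margins[np.arange(len(labels)), labels] = 0.0           # no loss on the true class
    return margins.sum(axis=1).mean()

# toy check: well-separated scores incur zero hinge loss
feats = np.array([[1.0, 0.0], [0.0, 1.0]])
W = np.array([[10.0, -10.0], [-10.0, 10.0]])
y = np.array([0, 1])
loss = multiclass_hinge_loss(feats, y, W)   # 0.0 for this toy example
```

In a joint training setup, the gradient of this loss would also be backpropagated into the DNN below the SVM layer; at decoding time the SVM scores stand in for the frame-level posteriors.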
Keywords :
hidden Markov models; neural nets; speech recognition; support vector machines; DNN; HMM state transition features; TIMIT task; continuous speech recognition; deep neural networks; deep neural support vector machines; frame-level posterior probability; frame-level training; maximum-margin criteria; multinomial logistic regression; sequence-level training; softmax activation; training algorithms; Hidden Markov models; Mathematical model; Neural networks; Speech; Speech recognition; Support vector machines; Training; DNN; maximum margin; multiclass SVM; sequence training; structured SVM;
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD, Australia
DOI :
10.1109/ICASSP.2015.7178777