Title :
Real-Time Robust Automatic Speech Recognition Using Compact Support Vector Machines
Author :
Solera-Ureña, Rubén ; García-Moral, Ana Isabel ; Peláez-Moreno, Carmen ; Martínez-Ramón, Manel ; Díaz-de-María, Fernando
Author_Institution :
Dept. of Signal Theory & Communications, Univ. Carlos III de Madrid, Leganés, Spain
fDate :
5/1/2012
Abstract :
In recent years, support vector machines (SVMs) have shown excellent performance in many applications, especially in the presence of noise. In particular, SVMs offer several advantages over artificial neural networks (ANNs) that have attracted the attention of the speech processing community. Nevertheless, their high computational requirements prevent them from being used in practice in automatic speech recognition (ASR), where ANNs have proven successful. The high complexity of SVMs in this context arises from the use of huge speech training databases with millions of samples and highly overlapped classes. This paper proposes a weighted least squares (WLS) training procedure that makes it possible to impose a compact semiparametric model on the SVM, which results in a dramatic complexity reduction. This reduction, of between two and three orders of magnitude with respect to conventional SVMs, allows the proposed hybrid WLS-SVC/HMM system to perform real-time speech decoding on a connected-digit recognition task (SpeechDat Spanish database). The experimental evaluation of the proposed system shows encouraging performance levels in clean and noisy conditions, although further improvements are required to reach the maturity level of current context-dependent HMM-based recognizers.
Keywords :
neural nets; speech coding; speech recognition; support vector machines; SpeechDat Spanish database; artificial neural networks; connected-digit recognition task; real-time robust automatic speech recognition; real-time speech decoding; speech processing community; weighted least squares training procedure; Hidden Markov models; Real-time systems; Speech; Training; Additive noise; SVM/HMM; artificial neural network (ANN)/hidden Markov model (HMM); compact support vector machine (SVM); hybrid automatic speech recognition (ASR); machine learning; real-time ASR; robust ASR;
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2011.2178597