Title :
Phone classification using HMM/SVM system and normalization technique
Author :
Yakoub, Mohammed Sidi ; Nkambou, Roger ; Selouani, Sid-Ahmed
Author_Institution :
UQAM, Montreal, QC, Canada
Abstract :
Support vector machines (SVMs) were originally developed for binary classification and later extended to multi-class classification. Because they are powerful and adapt well to hard classification problems, we chose them for automatic speech recognition (ASR). This paper investigates SVM multi-class classification coupled with hidden Markov models (HMMs) for classifying TIMIT phones. An SVM requires all training and test samples to have feature vectors of the same size. Because phone signals vary in length, even for the same phone, we apply a normalization technique, zero padding and resampling, to all data samples so that their feature vectors have the same size. After mapping the 61 TIMIT phones to 46 phones and conducting tests with LibSVM and HTK, we obtained a classification accuracy of 91.26% with the hybrid HMM/SVM system and 71.41% with the HMM-based system. These results show that the hybrid HMM/SVM system with the normalization technique outperforms the HMM-based system, improving recognition accuracy by about 19.8 percentage points. These experimental results encourage us to use the hybrid system and normalization technique in future work on spoken dialogue systems.
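The abstract names zero padding and resampling as the normalization steps but does not give their details, so the following is only a minimal sketch of one way such a step could look. It assumes each phone is a list of per-frame feature vectors (for example MFCC frames), that short phones are zero-padded and long phones are linearly resampled along the time axis, and that the result is flattened into one vector. The function names, the `target_frames` parameter, and the pad-or-resample rule are illustrative assumptions, not the authors' implementation.

```python
from typing import List

Frame = List[float]


def zero_pad(frames: List[Frame], target_frames: int) -> List[Frame]:
    """Append all-zero frames until the sequence has target_frames frames."""
    if not frames:
        raise ValueError("need at least one frame")
    if len(frames) > target_frames:
        raise ValueError("sequence is longer than target_frames")
    dim = len(frames[0])
    padding = [[0.0] * dim for _ in range(target_frames - len(frames))]
    return [list(f) for f in frames] + padding


def resample(frames: List[Frame], target_frames: int) -> List[Frame]:
    """Linearly interpolate along the time axis to get target_frames frames."""
    n = len(frames)
    if n == 0 or target_frames < 1:
        raise ValueError("need at least one input frame and one output frame")
    if n == 1 or target_frames == 1:
        # Nothing to interpolate between: repeat the first frame.
        return [list(frames[0]) for _ in range(target_frames)]
    out = []
    for i in range(target_frames):
        pos = i * (n - 1) / (target_frames - 1)  # fractional source index
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


def to_fixed_vector(frames: List[Frame], target_frames: int) -> List[float]:
    """Assumed rule: pad short phones, resample long ones, then flatten."""
    if len(frames) <= target_frames:
        fixed = zero_pad(frames, target_frames)
    else:
        fixed = resample(frames, target_frames)
    return [value for frame in fixed for value in frame]
```

Under these assumptions, every phone maps to a vector of length `target_frames` times the frame dimension, which is the fixed-size input an SVM trainer such as LibSVM expects.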
Keywords :
hidden Markov models; signal classification; speech recognition; support vector machines; ASR; HMM-SVM system; HTK; LibSVM; TIMIT phones; automatic speech recognition; binary classification; classification accuracy rate; features vector; hidden Markov model; multiclass classification; normalization technique; phone classification; phone signals; recognition accuracy; resampling; spoken dialogue system; zero padding; Equations; Feature extraction; Kernel; Optimization; Training; Mel Frequency Cepstral Coefficients; multi-classification; normalization
Conference_Title :
Signal Processing and Information Technology (ISSPIT), 2013 IEEE International Symposium on
Conference_Location :
Athens
DOI :
10.1109/ISSPIT.2013.6781861