Title :
Comparison of DTW and HMM for isolated word recognition
Author :
Sajjan, Sharada C. ; Vijaya, C.
Author_Institution :
Dept. of Electron. & Commun. Eng., SDM Coll. of Eng. & Technol., Dharwad, India
Abstract :
This study proposes limited-vocabulary isolated word recognition using Linear Predictive Coding (LPC) and Mel Frequency Cepstral Coefficients (MFCC) for feature extraction, and Dynamic Time Warping (DTW) and a discrete Hidden Markov Model (HMM) for recognition, and compares them. Features are extracted from speech frames of 300 samples with a 100-sample overlap, at an input sampling rate of 8 kHz. MFCC analysis gives a better recognition rate than LPC because it operates on a logarithmic frequency scale that resembles the human auditory system, whereas LPC has uniform resolution across the frequency plane. Feature extraction is followed by pattern recognition. Because the same word can be spoken at different rates, DTW is used to provide a non-linear time alignment between two voice signals. HMM, which models the words statistically, is also presented. Experimentally, recognition accuracy is observed to be better for HMM than for DTW. The database used is the digits zero to nine of the TI-46 isolated word corpus from the Linguistic Data Consortium.
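The framing and DTW alignment steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `frame_signal` and `dtw_distance` are hypothetical, the reading of "100 samples overlap" as a hop of 200 samples is an assumption, and the Euclidean local distance and the unconstrained three-way step pattern are assumed, since the abstract does not specify them.

```python
import math


def frame_signal(samples, frame_len=300, overlap=100):
    """Split a sample sequence into overlapping frames.

    Assumes consecutive frames share `overlap` samples, so the hop
    between frame starts is frame_len - overlap (200 by default).
    Trailing samples that do not fill a whole frame are dropped.
    """
    hop = frame_len - overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than frame_len")
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Each element of seq_a / seq_b is one frame's feature vector
    (for example LPC or MFCC coefficients). The local cost is the
    Euclidean distance (an assumption), and each cell may be reached
    from its left, lower, or diagonal neighbour.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # seq_a advances
                                     cost[i][j - 1],      # seq_b advances
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


if __name__ == "__main__":
    frames = frame_signal(list(range(1000)))
    print(len(frames), len(frames[0]))

    # The same pattern spoken "slower" aligns at zero cost.
    fast = [[1.0], [2.0], [3.0]]
    slow = [[1.0], [1.0], [2.0], [2.0], [3.0], [3.0]]
    print(dtw_distance(fast, slow))
```

Under these assumptions, at 8 kHz a 300-sample frame spans 37.5 ms and frames start every 25 ms. In a template-matching recognizer of this kind, an unknown utterance would typically be assigned the label of the stored reference with the smallest DTW cost.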
Keywords :
cepstral analysis; feature extraction; hidden Markov models; linear predictive coding; signal sampling; speech coding; speech recognition; statistical analysis; vocabulary; word processing; DTW; HMM; Linguistic Data Consortium; MFCC analysis; Mel frequency cepstral coefficients; TI-46 isolated word corpus zero-nine database; discrete hidden Markov model; dynamic time warping; feature extraction; frequency 8 kHz; frequency plane; human auditory system; input speech; limited vocabulary isolated word recognition; linear predictive coding; logarithmic scale; pattern recognition; recognition accuracy; sampling rate; speech frame; statistical modelling; uniform resolution; voice signal; Accuracy; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Speech recognition; Vectors; Dynamic Time Warping (DTW); Hidden Markov Model (HMM); Linear Predictive Coding (LPC); Mel Frequency Cepstral Coefficients (MFCC);
Conference_Title :
Pattern Recognition, Informatics and Medical Engineering (PRIME), 2012 International Conference on
Conference_Location :
Salem, Tamilnadu
Print_ISBN :
978-1-4673-1037-6
DOI :
10.1109/ICPRIME.2012.6208391