Title :
On time alignment and metric algorithms for speech recognition
Author :
Yfantis, E.A. ; Lazarakis, T. ; Angelopoulos, A. ; Elison, J.D. ; Zhang, Y.
Author_Institution :
Dept. of Comput. Sci., Nevada Univ., Las Vegas, NV, USA
Abstract :
An algorithm is presented for comparing speech waveforms to decide whether a spoken utterance belongs to a given vocabulary of word waveforms and, if it does, to choose the matching word. The algorithm has been implemented together with our own vector interpolation alignment algorithm, which has a classification rate comparable to that of dynamic time warping (DTW) but requires significantly less computation, making it much faster than DTW-based algorithms. Both alignment algorithms are presented and compared. An alternative algorithm has also been investigated, in which the two utterances to be compared are divided into the same number of intervals, so that the interval length in one utterance differs from the interval length in the other. When appropriate adjustments are made so that the beginnings and ends of the two utterances match, this algorithm has a classification rate comparable to that of dynamic time warping. Furthermore, an alternative to linear predictive coding (LPC) analysis for utterance recognition is presented. Unlike LPC, which is an extrapolation algorithm, our algorithm is an interpolation algorithm. Theoretically, it has smaller variance and smaller mean square error than the LPC algorithm. Preliminary results show that our algorithm provides a high probability of correct classification.
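The abstract contrasts two ways of time-aligning utterances before measuring their distance. The sketch below illustrates that contrast only. `dtw_distance` is the textbook dynamic time warping recurrence. `resample_linear` and `interpolation_distance` are a hypothetical stand-in that uses plain linear interpolation with matched endpoints, since the abstract does not specify the authors' vector interpolation algorithm. Scalar samples are assumed in place of feature vectors, and `classify`, its `threshold` parameter, and the toy templates are illustrative names not taken from the paper.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW cost between two 1-D sequences: O(len(a) * len(b))."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def resample_linear(seq, length):
    """Linearly interpolate seq onto `length` evenly spaced points.

    The first and last samples are preserved, so the beginning and end
    of the resampled sequence match those of the input.
    """
    if length == 1 or len(seq) == 1:
        return [float(seq[0])] * length
    scale = (len(seq) - 1) / (length - 1)
    out = []
    for k in range(length):
        pos = k * scale
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append((1 - frac) * seq[lo] + frac * seq[hi])
    return out


def interpolation_distance(a, b):
    """Assumed stand-in for interpolation alignment: O(len(a)).

    Resamples b to the length of a, then sums pointwise absolute differences.
    """
    b_aligned = resample_linear(b, len(a))
    return sum(abs(x - y) for x, y in zip(a, b_aligned))


def classify(utterance, vocabulary, distance, threshold):
    """Return the closest vocabulary word, or None if nothing is close enough."""
    best_word, best = None, inf
    for word, template in vocabulary.items():
        d = distance(utterance, template)
        if d < best:
            best_word, best = word, d
    return best_word if best <= threshold else None


# Toy templates, purely illustrative.
vocabulary = {"yes": [0, 1, 2, 1, 0], "no": [2, 2, 0, 0, 2]}
utterance = [0, 1, 2, 2, 1, 0]
print(classify(utterance, vocabulary, dtw_distance, threshold=2.0))
print(classify(utterance, vocabulary, interpolation_distance, threshold=2.0))
```

The interpolation variant makes a single pass over the utterance per template, whereas DTW fills a full cost table, which is the kind of computational saving the abstract describes. The rejection threshold is what allows an out-of-vocabulary utterance to return `None` instead of the nearest word.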
Keywords :
interpolation; pattern classification; probability; sequences; speech recognition; vectors; classification rate; dynamic time warping; interpolation algorithm; matching word; metric algorithms; recognition accuracy; speech waveforms; spoken utterance; time alignment; utterance recognition; vector interpolation alignment algorithm; word waveforms; Automatic speech recognition; Dynamic programming; Extrapolation; Information retrieval; Interpolation; Linear predictive coding; Mean square error methods; Speech processing; Speech recognition; Vocabulary
Conference_Title :
Information Intelligence and Systems, 1999. Proceedings. 1999 International Conference on
Conference_Location :
Bethesda, MD
Print_ISBN :
0-7695-0446-9
DOI :
10.1109/ICIIS.1999.810311