Title :
Trace-segmentation of isolated utterances for speech recognition
Author :
Cabral, Euvaldo F., Jr. ; Tattersall, Graham D.
Author_Institution :
Lab. of Commun. & Signals, Sao Paulo Univ., Brazil
Abstract :
Trace-segmentation (sometimes called variable frame rate coding) is a method for nonlinear time-normalization of a sequence of speech representation frames prior to recognition of the sequence. Numerous attempts have been made to perform speech recognition using trace-segmentation, but they have failed to match the performance of DTW or HMM recognition. This failure may be due to the use of inappropriate distance metrics to perform the segmentation, or to the use of an inappropriate spatial sampling interval along the trace. This paper describes an investigation into these problems, in which the appropriate Nyquist sample rate of the spatial trace is determined by analyzing the frequency of the temporal variation of the speech frames. It is also shown that separate segmentation of the trajectory described by each individual coefficient in the speech frame leads to much improved recognition, which exceeds the performance provided by DTW recognition of the same database.
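As an illustration of the general idea only, trace-segmentation can be sketched as resampling the frame sequence at equal steps along the path it traces through feature space. The sketch below is not the authors' implementation: the Euclidean inter-frame distance, the linear interpolation, the function names `trace_segment` and `trace_segment_per_coefficient`, and the parameter `num_points` are all assumptions made for the example. The per-coefficient variant is one possible reading of "separate segmentation of the trajectory described by each individual coefficient".

```python
import numpy as np

def trace_segment(frames, num_points):
    """Resample frames (T x D) at num_points equal steps along the trace."""
    frames = np.asarray(frames, dtype=float)
    if frames.ndim == 1:
        frames = frames[:, None]
    # Distance between consecutive frames (Euclidean metric is an assumption).
    steps = np.linalg.norm(np.diff(frames, axis=0), axis=1)
    # Cumulative trace length at each original frame.
    arc = np.concatenate(([0.0], np.cumsum(steps)))
    if arc[-1] == 0.0:
        # Degenerate trace: all frames identical, so just repeat the first.
        return np.repeat(frames[:1], num_points, axis=0)
    # Equally spaced positions along the trace (the spatial sampling interval).
    targets = np.linspace(0.0, arc[-1], num_points)
    out = np.empty((num_points, frames.shape[1]))
    for k in range(frames.shape[1]):
        # Linear interpolation between original frames (an assumption).
        out[:, k] = np.interp(targets, arc, frames[:, k])
    return out

def trace_segment_per_coefficient(frames, num_points):
    """Hypothetical variant: segment each coefficient's trajectory separately."""
    frames = np.asarray(frames, dtype=float)
    cols = [trace_segment(frames[:, [k]], num_points)[:, 0]
            for k in range(frames.shape[1])]
    return np.stack(cols, axis=1)

# Example: three frames resampled to four equally spaced trace points.
joint = trace_segment([[0, 0], [1, 0], [3, 0]], 4)
separate = trace_segment_per_coefficient([[0, 10], [1, 10], [3, 14]], 4)
```

In the joint version, stretches where the frames barely change contribute little trace length and so receive few samples, which is what gives the time-normalizing effect. The choice of `num_points` corresponds to the spatial sampling interval that the abstract identifies as a possible source of error.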
Keywords :
signal representation; signal sampling; speech coding; speech recognition; DTW; DTW recognition; HMM recognition; Nyquist sample rate; database; distance metrics; frequency analysis; isolated utterances; nonlinear time-normalization; performance; spatial sampling interval; speech frame coefficient; speech representation frames; temporal variation; trace-segmentation; trajectory; variable frame rate coding; Artificial neural networks; Databases; Delay effects; Feeds; Frequency; Hidden Markov models; Laboratories; Neural networks; Sampling methods; Shape; Speech analysis; Speech recognition
Conference_Title :
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location :
Detroit, MI
Print_ISBN :
0-7803-2431-5
DOI :
10.1109/ICASSP.1995.479597