DocumentCode :
3012911
Title :
Speech recognition in scale space
Author :
Lyon, Richard F.
Author_Institution :
Schlumberger Palo Alto Research, Palo Alto, CA
Volume :
12
fYear :
1987
fDate :
Apr 1987
Firstpage :
1265
Lastpage :
1268
Abstract :
Scale-space filtering, proposed by Witkin (ICASSP 84) for describing natural structure in one-dimensional signals, has been extended for application to segmentation and description of vector-valued functions of time, such as speech spectrograms. By analyzing the rate of change of a vector trajectory at many different scales of time-smoothing, a tree of natural segments can be constructed. At various levels in the tree (i.e., at various scales), these segments are found to agree well with the kind of linguistically and perceptually important segments that spectrogram readers use to describe sound patterns of speech. Scale-space segmentations of cochleagrams (spectrograms based on a computational model of the peripheral auditory system) have been experimentally applied to word recognition. Recognition using fixed-scale segmentations with finite-state word models and a Viterbi search has led to speaker-independent digit recognition accuracies of greater than 97%, about the same as in tests with non-segmented cochleagrams. More complex recognition algorithms that use the segmentation tree are being developed, and scale-space experiments with connected digits and sentences are also underway.
Keywords :
Auditory system; Dynamic programming; Filtering; Pattern matching; Signal processing algorithms; Spectrogram; Speech recognition; Testing; Vector quantization; Viterbi algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '87.
Type :
conf
DOI :
10.1109/ICASSP.1987.1169448
Filename :
1169448