Title :
A hybrid neural network, dynamic programming word spotter
Author :
Zeppenfeld, Torsten ; Waibel, Alex H.
Author_Institution :
Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
A novel keyword-spotting system that combines neural network and dynamic programming techniques is presented. The system exploits the strengths of time delay neural networks (TDNNs), which include strong generalization ability, potential for parallel implementation, robustness to noise, and time-shift-invariant learning. It uses dynamic programming models because they can time-warp input speech patterns. The system was trained and tested on the Stonehenge Road Rally database, a 20-keyword-vocabulary, speaker-independent, continuous-speech corpus. The system currently achieves a figure of merit (FOM) of 82.5%. FOM is the detection rate averaged from 0 to 10 false alarms per keyword hour. This measure is explained in detail.
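The FOM definition in the abstract can be illustrated with a short sketch. The code below is not the authors' scoring procedure. It is one simplified reading of "detection rate averaged from 0 to 10 false alarms per keyword hour" for a single keyword, with no interpolation between operating points. The function name `figure_of_merit`, its parameters, and the `(score, is_hit)` input layout are all assumptions made for illustration.

```python
def figure_of_merit(detections, num_true, hours, max_fa_per_hour=10):
    """Simplified single-keyword FOM (illustrative assumption, not the paper's code).

    detections: list of (score, is_hit) putative keyword detections, where
                is_hit is True for a correct detection and False for a false alarm.
    num_true:   number of true occurrences of the keyword in the test speech.
    hours:      duration of the test speech in hours.
    """
    if num_true <= 0 or hours <= 0:
        raise ValueError("num_true and hours must be positive")

    # Number of false alarms that corresponds to the upper limit of the range.
    max_fa = int(max_fa_per_hour * hours)
    if max_fa < 1:
        raise ValueError("test speech too short for this simplified estimate")

    # Walk through detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    hits = 0
    rates = []  # detection rate observed just before each false alarm
    for _score, is_hit in ranked:
        if is_hit:
            hits += 1
        else:
            rates.append(hits / num_true)
            if len(rates) == max_fa:
                break

    # If the system produced fewer false alarms than the limit, the remaining
    # operating points all see every hit that was found.
    while len(rates) < max_fa:
        rates.append(hits / num_true)

    return sum(rates) / max_fa


# Hypothetical example: 4 true occurrences in 1 hour of speech.
example = [(0.9, True), (0.8, False), (0.7, True), (0.6, True), (0.5, False)]
fom = figure_of_merit(example, num_true=4, hours=1.0)
```

In a multi-keyword evaluation such as the 20-keyword task described here, a per-keyword value like this would still need to be combined across keywords; how that is done is left to the paper's detailed explanation.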
Keywords :
dynamic programming; neural nets; speech recognition; Stonehenge Road Rally database; TDNN; continuous-speech corpus; detection rate; dynamic programming word spotter; figure of merit; input speech patterns; speaker independent speech recognition; time delay neural networks; time shift invariant learning; time warping; vocabulary; Computer science; Dictionaries; Distributed databases; Dynamic programming; Neural networks; Noise robustness; Speech enhancement; Speech recognition; System testing; Vocabulary;
Conference_Title :
1992 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-92)
Conference_Location :
San Francisco, CA
Print_ISBN :
0-7803-0532-9
DOI :
10.1109/ICASSP.1992.226116