Title :
Enhancing automatic speech recognition with an ultrasonic lip motion detector
Author :
Jennings, David L. ; Ruck, Dennis W.
Author_Institution :
Dept. of Electr. & Comput. Eng., Air Force Inst. of Technol., Wright-Patterson AFB, OH, USA
Abstract :
This paper presents the results of experiments with a simple ultrasonic lip motion detector, or “Ultrasonic Mike”, in automatic speech recognition. The device is tested in a speaker-dependent, isolated-word recognition task with a vocabulary consisting of the spoken digits from zero to nine. The “Ultrasonic Mike” is used as input to an automatic lip reader, which uses template matching and dynamic time warping to determine the best candidate for a given test utterance. The device is first tested as a stand-alone automatic lip reader, achieving accuracy as high as 89%. Next, the automatic lip reader is combined with a conventional automatic speech recognizer. Classifier fusion is based on a pseudo probability mass function derived from the dynamic time warping distances. The combined system is tested with various levels of acoustic noise added. In a typical example, at 0 dB, the acoustic recognizer's accuracy was 78% and the lip reader's accuracy was 69%, but the combined accuracy was 93%. This experiment demonstrates that this simple ultrasonic lip motion detector, which has an output data rate 12,500 times lower than that of a typical video camera, can improve automatic speech recognition in noisy environments. It also demonstrates an effective classifier fusion algorithm based on dynamic time warping distances.
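The abstract names the pipeline (template matching with dynamic time warping, then fusion through a pseudo probability mass function built from the DTW distances) but gives no formulas. The sketch below is one plausible reading, not the authors' implementation. It assumes one-dimensional feature sequences, an absolute-difference local cost, an inverse-distance mapping from distances to a pseudo-PMF, and product-rule fusion; the function names `dtw_distance`, `pseudo_pmf`, `classify`, and `fuse` are all hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    Uses an absolute-difference local cost and the standard three-way
    step pattern (insertion, deletion, match). Both choices are assumptions.
    """
    n, m = len(a), len(b)
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def pseudo_pmf(distances, eps=1e-9):
    """Turn per-word DTW distances into weights that sum to 1.

    Inverse-distance weighting is an assumption; the abstract does not
    say how the pseudo probability mass function is derived.
    """
    inv = {word: 1.0 / (dist + eps) for word, dist in distances.items()}
    total = sum(inv.values())
    return {word: w / total for word, w in inv.items()}


def classify(test, templates):
    """Template matching: score a test utterance against one template per word."""
    distances = {word: dtw_distance(test, tpl) for word, tpl in templates.items()}
    return pseudo_pmf(distances)


def fuse(pmf_a, pmf_b):
    """Combine two pseudo-PMFs with a renormalized product rule (an assumption)."""
    prod = {word: pmf_a[word] * pmf_b[word] for word in pmf_a}
    total = sum(prod.values())
    return {word: p / total for word, p in prod.items()}


def best_candidate(pmf):
    """Return the word with the largest pseudo-probability."""
    return max(pmf, key=pmf.get)
```

In use, each recognizer (acoustic and lip) would call `classify` on its own feature sequence and templates, and `best_candidate(fuse(acoustic_pmf, lip_pmf))` would give the combined decision. Under the product rule, a word that both recognizers rank moderately high can beat a word that only one of them favors, which is one way a combined system could outperform either recognizer alone.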
Keywords :
acoustic signal processing; nonelectric sensing devices; speech recognition; ultrasonic transducers; Ultrasonic Mike; acoustic noise; acoustic recognizer accuracy; automatic lip reader; automatic speech recognition; automatic speech recognizer; classifier fusion algorithm; dynamic time warping; dynamic time warping distances; experiment; lip reader accuracy; noisy environments; output data rate; pseudo probability mass function; speaker dependent isolated word recognition; spoken digits; template matching; test utterance; ultrasonic lip motion detector; vocabulary; Acoustic noise; Acoustic signal detection; Acoustic testing; Automatic speech recognition; Automatic testing; Cameras; Detectors; Motion detection; System testing; Vocabulary;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location :
Detroit, MI
Print_ISBN :
0-7803-2431-5
DOI :
10.1109/ICASSP.1995.479832