Author :
Rabiner, Lawrence R. ; Wilpon, Jay G. ; Soong, Frank K.
Author_Institution :
AT&T Bell Lab., Murray Hill, NJ, USA
Abstract :
The authors use an enhanced analysis feature set consisting of both instantaneous and transitional spectral information and test the hidden-Markov-model (HMM)-based connected-digit recognizer in speaker-trained, multispeaker, and speaker-independent modes. For the evaluation, both a 50-talker connected-digit database recorded over local, dialed-up telephone lines and the Texas Instruments 225-adult-talker connected-digits database are used. On these databases, the recognizer achieved string error rates of 0.35%, 1.65%, and 1.75% for known-length strings in the speaker-trained, multispeaker, and speaker-independent modes, respectively, and 0.78%, 2.85%, and 2.94% for unknown-length strings of up to seven digits in the same three modes. Several experiments were carried out to determine the best set of conditions (e.g., training, recognition, parameters) for recognition of digits. The results and the interpretation of these experiments are described.
Keywords :
Markov processes; speech recognition; connected digit recognition; hidden Markov models; spectral information; strings; Cepstral analysis; Cepstrum; Distributed databases; Information analysis; Pattern recognition; Spatial databases; Telephony; Testing; Vocabulary