Title :
Acoustic modeling for speech recognition based on spotting of phonetic units
Author_Institution :
Xerox Palo Alto Res. Center, CA, USA
Abstract :
A two-stage approach to acoustic modeling for speech recognition is presented. First, a set of phoneme-like units is spotted in the continuous speech stream; then the outputs of the spotters are modeled as the observations generated by an HMM. The motivation is that this model allows overlap and/or gaps in the acoustic realizations of the phonetic units upon which recognition is based. This is in contrast to conventional HMM-based approaches, which model continuous speech as a concatenation of models for the individual phonetic units, and which therefore assume that the acoustic realizations also concatenate. We argue that phenomena such as coarticulation can be viewed as temporal overlap of the acoustic realizations of adjacent phonetic units. Some TIMIT phone recognition experiments are presented, in which the new model has an error rate approximately 8% higher than a conventional context-independent HMM-based recognizer using standard CEP-based features. Some possible improvements to the model are discussed, and experiments are continuing.
Keywords :
acoustic signal processing; hidden Markov models; speech recognition; HMM; TIMIT-phone recognition experiments; acoustic modeling; acoustic realizations; adjacent phonetic units; coarticulation; continuous speech stream; error rate; phonetic units spotting; temporal overlap; Context modeling; Error analysis; Machinery; Mathematical model; Neural networks; Vocabulary
Conference_Title :
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location :
Adelaide, SA
Print_ISBN :
0-7803-1775-0
DOI :
10.1109/ICASSP.1994.389362