A training procedure for isolated word recognition systems

Author

Furui, Sadaoki

Author_Institution

Nippon Telegraph and Telephone Public Corporation, Tokyo, Japan

Volume

28

Issue

2

fYear

1980

fDate

4/1/1980 12:00:00 AM

Firstpage

129

Lastpage

136

Abstract

A procedure has been devised to reduce the amount of training required for a phoneme-based speaker-dependent word recognition system while maintaining performance. Each new speaker is required to provide utterances of only a fraction of the entire vocabulary as a training set. A set of transformation rules is used to estimate phoneme templates for the entire vocabulary from the phoneme templates included in the training set. The transformation rules are obtained in a pretraining procedure in which a group of speakers provides utterances of the entire vocabulary, and multiple regression analysis (MRA) is used to obtain linear estimates of the entire phoneme template set in terms of the set designated as training templates. This group of speakers is generally distinct from the group of training speakers. Thus, since the transformation rules are established independently of the training speakers, the entire procedure can be considered a hybrid speaker-dependent/speaker-independent system. Recognition experiments using spoken digits uttered by 30 male and female speakers and 67 airport names uttered by 30 male speakers confirmed the effectiveness of this training procedure. A mean recognition accuracy of 98.2 percent was obtained for the latter utterance set after a 12-word training procedure.
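
The pretraining step described in the abstract can be illustrated with a minimal sketch, assuming each template is reduced to a few scalar features. It fits one linear transformation rule by ordinary least squares (the normal equations) and applies it to a new speaker. The function names, the two-feature layout, and the toy numbers are hypothetical and are not taken from the paper, which may parameterize templates and regressions differently.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; more pretraining speakers needed")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_transformation_rule(train_feats, target_vals):
    """Least-squares fit of target ~ w0 + sum_i w_i * train_feats[i]."""
    rows = [[1.0] + list(f) for f in train_feats]  # prepend intercept term
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target_vals)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def estimate(weights, train_feat):
    """Apply a fitted rule to one speaker's training-template features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], train_feat))


# Hypothetical pretraining data: one row per pretraining speaker.
# Each row holds two features from templates that ARE in the training set.
pretrain_inputs = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (0.0, 4.0)]
# One feature of a template that is NOT in the training set, same speakers.
# Toy values generated as 1 + 2*x1 - 0.5*x2 so the fit can be checked.
pretrain_targets = [2.0, 4.5, 4.5, -1.0]

weights = fit_transformation_rule(pretrain_inputs, pretrain_targets)

# A new speaker supplies only the training-set features; the rule fills in
# the missing template feature.
new_speaker_estimate = estimate(weights, (2.0, 2.0))
```

In a fuller system, one such rule would be fitted for every feature of every phoneme template outside the training set, using only the pretraining speakers' data.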

Keywords

Acoustic measurements; Airports; Automatic speech recognition; Regression analysis; Speech processing; Speech recognition; System testing; Telegraphy; Telephony; Vocabulary

fLanguage

English

Journal_Title

Acoustics, Speech and Signal Processing, IEEE Transactions on

Publisher

ieee

ISSN

0096-3518

Type

jour

DOI

10.1109/TASSP.1980.1163393

Filename

1163393