Title :
An enhanced training method for speech recognition in the VODIS project
Author :
Chau, E. ; Chung, Y.K. ; Frangoulis, E. ; Lucas, A.
Author_Institution :
ICL, Hong Kong
Abstract :
The authors report on a new training scheme for embedded training in connected speech recognition developed for the LOGOS II speech recognizer in the VODIS project. The technique is based on speech segmentation and k-means clustering, and it has been applied successfully to both linear predictive coding parameter templates and digital filterbank templates. Speech segmentation is achieved by applying Fisher's discriminant algorithm to the time-warped distances between two minimally different connected-word utterances selected by the grammatical rules and the vocabulary of the particular application. The tokens within each word set are clustered using k-means, and the cluster centers in each set are chosen as reference tokens for the particular word. One to four cluster centers have been used in this investigation. The enhanced training scheme has been applied to connected digit and connected word recognition using the 170-word VODIS vocabulary. Recognition performance improvements were between 25% and 34% over training with isolated words.
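The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the LOGOS II implementation: it assumes each segmented token has already been reduced to a fixed-length feature vector (the abstract does not say how tokens are time-normalized), it uses squared Euclidean distance rather than a time-warped distance, and the names `kmeans` and `train_references` are hypothetical.

```python
import random


def kmeans(tokens, k, iters=20, seed=0):
    """Plain k-means over fixed-length feature vectors (lists of floats).

    Assumption: tokens are already time-normalized to equal length.
    """
    rng = random.Random(seed)
    # Initialize centers from k distinct tokens chosen at random.
    centers = [list(t) for t in rng.sample(tokens, k)]
    for _ in range(iters):
        # Assign each token to its nearest center (squared Euclidean distance).
        groups = [[] for _ in range(k)]
        for t in tokens:
            d = [sum((a - b) ** 2 for a, b in zip(t, c)) for c in centers]
            groups[d.index(min(d))].append(t)
        # Recompute each center as the mean of its group; keep it if the group is empty.
        new_centers = []
        for c, g in zip(centers, groups):
            if g:
                new_centers.append([sum(col) / len(g) for col in zip(*g)])
            else:
                new_centers.append(c)
        if new_centers == centers:
            break  # assignments have stabilized
        centers = new_centers
    return centers


def train_references(tokens_by_word, k):
    """Return up to k cluster centers per word, used as reference tokens."""
    return {
        word: kmeans(toks, min(k, len(toks)))
        for word, toks in tokens_by_word.items()
    }


if __name__ == "__main__":
    # Toy data: four two-dimensional tokens for one hypothetical word.
    data = {"one": [[0.0, 0.0], [0.2, 0.0], [10.0, 10.0], [10.2, 10.0]]}
    print(train_references(data, k=2))
```

Setting `k` between 1 and 4 corresponds to the range of cluster centers per word reported in the abstract.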
Keywords :
speech recognition; Fisher's discriminant algorithm; LOGOS II speech recognizer; VODIS project; connected digit recognition; connected speech recognition; connected word recognition; connected-word utterances; digital filterbank templates; embedded training; grammatical rules; k-means clustering; linear predictive coding parameter templates; recognition performance; speech segmentation; time-warped distances; training method; vocabulary; Clustering algorithms; Databases; Fluctuations; Linear predictive coding; Partitioning algorithms; Robustness; Smoothing methods; Speech recognition; Testing; Vocabulary
Conference_Title :
Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on
Conference_Location :
Glasgow
DOI :
10.1109/ICASSP.1989.266503