Title :
A new algorithm for representing acoustic feature dynamics
Author :
Katagiri, Shigeru ; McDermott, Erik ; Yokota, Manami
Author_Institution :
ATR Auditory & Visual Perception Res. Lab., Osaka, Japan
Abstract :
The goal of this algorithm is to reduce learning time in a multireference phoneme-recognition system based on learning vector quantization (LVQ). The algorithm uses two kinds of vectors: codebook vectors and reference node vectors. An ordering procedure, similar to that of self-organizing feature maps, is used for the codebook design, and LVQ is used to adapt the reference node vectors. The algorithm is divided into four steps: (1) codebook design, (2) mapping from the phoneme vector into the node vector, (3) reduction of the number of reference node vectors, and (4) adaptation of the reference node vectors. In particular, the mapping translates the high-dimensional phoneme vector into a trajectory on a two-dimensional plane, so that the acoustic feature dynamics can be visualized. Geometrical distance on the plane is used both to reduce the number of reference node vectors and to adapt them, which considerably reduces the learning time. Experiments on Japanese voiced plosives show that the algorithm considerably speeds up learning while maintaining a high recognition rate of 97%.
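The abstract gives no implementation details, so the following is only a minimal sketch of how steps (2) and (4) might look, under stated assumptions: each feature frame is mapped to the grid position of its nearest codebook vector (squared Euclidean distance), and reference vectors are adapted with the standard LVQ1 rule. The function names `map_to_trajectory` and `lvq1_update`, the learning rate, and the grid layout are hypothetical and do not come from the paper.

```python
import numpy as np

def map_to_trajectory(frames, codebook, grid_shape):
    """Map each feature frame to the 2-D grid position of its nearest
    codebook vector (assumption: squared Euclidean distance, row-major grid)."""
    # Pairwise squared distances, shape (n_frames, n_codes)
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    winners = dists.argmin(axis=1)
    rows, cols = np.unravel_index(winners, grid_shape)
    # One (row, col) point per frame: a trajectory on the plane
    return np.stack([rows, cols], axis=1)

def lvq1_update(ref, x, same_class, lr=0.05):
    """Standard LVQ1 step: move the reference vector toward x when the
    class labels match, away from x otherwise."""
    sign = 1.0 if same_class else -1.0
    return ref + sign * lr * (x - ref)

# Toy usage: a 2x2 grid of 3-dimensional codebook vectors
codebook = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [1.0, 1.0, 1.0]])
frames = np.array([[0.1, 0.0, 0.0],
                   [0.9, 1.0, 0.9]])
trajectory = map_to_trajectory(frames, codebook, (2, 2))
```

In this sketch the trajectory is a sequence of grid coordinates, so distances between trajectories can be computed on the plane rather than in the original feature space.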
Keywords :
speech recognition; Japanese voiced plosives; acoustic feature dynamics; algorithm; codebook vector; learning time; learning vector quantization; mapping; multireference phoneme-recognition system; ordering procedure; phoneme vector; reference node vector; Algorithm design and analysis; Delay effects; Laboratories; Poles and towers; Trajectory; Vector quantization; Visual perception; Visualization; Vocabulary
Conference_Title :
Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on
Conference_Location :
Glasgow
DOI :
10.1109/ICASSP.1989.266430