Title :
Phonetic feature extraction by time-sequence binary classifiers
Author :
Li, Jianmin ; Fang, Ditang
Author_Institution :
Dept. of Comput. Sci., Tsinghua Univ., Beijing, China
Abstract :
In this paper, we present time-sequence binary classifiers (TSBCs) for phonetic feature extraction. A large neural network is divided into an array of TSBCs. Each TSBC is a multilayer neural network trained specifically to extract the low-level acoustic features of a single phoneme category, which lowers neural network complexity. The output units of a TSBC are arranged sequentially and trained to reflect the temporal information of phonetic features, which is very important in speech recognition. In our speaker-independent all-Chinese-syllable continuous speech recognition system, TSBCs are efficiently combined with HMM techniques: the TSBCs extract low-level phonetic features, and the HMMs recognize high-level speech units. In evaluation experiments, the system obtains 97.0% word accuracy for speaker-independent, large-vocabulary continuous speech recognition.
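The data flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class name `TSBC`, the helper `extract_features`, the single hidden layer, the sigmoid activations, and all layer sizes are hypothetical, and the weights are random rather than trained. It shows an array of small per-category networks whose time-ordered output units are concatenated into one feature vector that an HMM stage could consume.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TSBC:
    """One small MLP dedicated to a single phoneme category (hypothetical layout)."""

    def __init__(self, n_inputs, n_hidden, n_time_units, rng):
        # Random, untrained weights: this sketch only illustrates the data flow.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                      for _ in range(n_time_units)]

    def forward(self, frame_window):
        hidden = [sigmoid(sum(w * x for w, x in zip(row, frame_window)))
                  for row in self.w_hidden]
        # Output units are kept in time order; treating each as a binary
        # "belongs to this category" score is an assumption of this sketch.
        return [sigmoid(sum(w * h for w, h in zip(row, hidden)))
                for row in self.w_out]


def extract_features(classifiers, frame_window):
    """Concatenate the outputs of every TSBC into one feature vector."""
    features = []
    for tsbc in classifiers:
        features.extend(tsbc.forward(frame_window))
    return features


rng = random.Random(0)
# Assumed sizes, chosen only to keep the example small.
N_CATEGORIES, N_INPUTS, N_HIDDEN, N_TIME_UNITS = 4, 12, 6, 3
bank = [TSBC(N_INPUTS, N_HIDDEN, N_TIME_UNITS, rng) for _ in range(N_CATEGORIES)]
window = [rng.uniform(-1.0, 1.0) for _ in range(N_INPUTS)]
features = extract_features(bank, window)
print(len(features))
```

Because each network handles one category, every `TSBC` here stays small, and the feature vector length is simply the number of categories times the number of time-ordered output units. In a full system, one such vector per analysis window would form the observation sequence passed to the HMMs.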
Keywords :
computational complexity; feature extraction; hidden Markov models; multilayer perceptrons; pattern classification; speech recognition; HMM techniques; TSBC; hidden Markov models; low-level acoustic feature extraction; multilayer neural network; phonetic feature extraction; speaker-independent all-Chinese-syllable continuous speech recognition system; temporal information; time-sequence binary classifiers; Acoustic arrays; Artificial intelligence; Computer networks; Computer science; Data mining; Feature extraction; Hidden Markov models; Multi-layer neural network; Neural networks; Speech recognition;
Conference_Title :
Neural Networks, 1995. Proceedings., IEEE International Conference on
Conference_Location :
Perth, WA
Print_ISBN :
0-7803-2768-3
DOI :
10.1109/ICNN.1995.488193