Regional Information Center for Science and Technology - Phoneme-based vector quantization in a discrete HMM speech recognizer

DocumentCode :

1288306

Title :

Phoneme-based vector quantization in a discrete HMM speech recognizer

Author :

Zhang, Yaxin ; Togneri, Roberto ; Alder, Michael

Author_Institution :

Motorola Australian Res. Centre, Botany, NSW, Australia

Volume :

Issue :

fYear :

1997

fDate :

1/1/1997 12:00:00 AM

Firstpage :

Lastpage :

Abstract :

The quantization distortion of vector quantization (VQ) is a key element that affects the performance of a discrete hidden Markov modeling (DHMM) system. Many researchers have recognized this problem and have tried to offset the disadvantage of conventional VQ by using integrated features or multiple codebooks in their systems. However, this increases the computational complexity of those systems. Investigations have shown that the speech signal space consists of a finite number of clusters that represent phoneme data sets from male and female speakers and that follow Gaussian distributions. We propose an alternative VQ method in which each phoneme is treated as a cluster in the speech space and a Gaussian model is estimated for each phoneme. A Gaussian mixture model (GMM) is generated by the expectation-maximization (EM) algorithm for the whole speech space and used as a codebook in which each code word is a Gaussian model and represents a certain cluster. An input utterance is classified as a certain phoneme, or a set of phonemes, only when that phoneme or those phonemes give the highest likelihood. A typical discrete HMM system was used for both phoneme and isolated word recognition. The results show that the phoneme-based Gaussian modeling vector quantization classifies the speech space more effectively, and that significant improvements in the performance of the DHMM system have been achieved.
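The idea in the abstract, a codebook whose code words are per-phoneme Gaussians and a quantizer that picks the most likely one, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`fit_phoneme_gaussians`, `log_likelihood`, `quantize`), the diagonal-covariance assumption, and the variance floor are all hypothetical choices made for illustration, and the EM refinement of the mixture mentioned in the abstract is omitted.

```python
import numpy as np


def fit_phoneme_gaussians(features, labels, var_floor=1e-3):
    """Estimate one diagonal-covariance Gaussian per phoneme label.

    features: (n_frames, n_dims) array of feature vectors.
    labels:   (n_frames,) array of phoneme labels, one per frame.
    Returns a dict mapping phoneme -> (mean, variance).
    Assumption: diagonal covariance, no EM refinement.
    """
    codebook = {}
    for phoneme in np.unique(labels):
        frames = features[labels == phoneme]
        mean = frames.mean(axis=0)
        # Floor the variance so a tight cluster cannot cause division by zero.
        var = np.maximum(frames.var(axis=0), var_floor)
        codebook[str(phoneme)] = (mean, var)
    return codebook


def log_likelihood(frames, mean, var):
    """Log density of each frame under a diagonal Gaussian."""
    return -0.5 * np.sum(
        np.log(2.0 * np.pi * var) + (frames - mean) ** 2 / var, axis=-1
    )


def quantize(frames, codebook):
    """Map each frame to the phoneme whose Gaussian gives the highest likelihood."""
    phonemes = list(codebook)
    scores = np.stack(
        [log_likelihood(frames, *codebook[p]) for p in phonemes], axis=1
    )
    return [phonemes[i] for i in np.argmax(scores, axis=1)]
```

In a setup like the one the abstract describes, the sequence of symbols returned by `quantize` would take the place of conventional VQ codebook indices as the observation sequence for a discrete HMM.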

Keywords :

Gaussian distribution; hidden Markov models; speech coding; speech processing; speech recognition; vector quantisation; Gaussian distributions; Gaussian mixture model; Gaussian model; VQ; code word; computational complexity; discrete HMM speech recognizer; discrete HMM system; discrete hidden Markov modeling; expectation-maximization algorithm; female speakers; input utterance; integrated feature; isolated word recognition; male speakers; multiple codebook; performance; phoneme based vector quantization; phoneme data sets; phoneme recognition; quantization distortion; speech signal space; Associate members; Australia; Clustering algorithms; Computational complexity; Computational efficiency; Gaussian distribution; Hidden Markov models; Partitioning algorithms; Speech recognition; Vector quantization;

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/89.554266

Filename :

554266

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1288306