Title :
Incorporating phonetic knowledge into a multi-stream HMM framework
Author :
Norouzian, Atta ; Selouani, Sid-Ahmed ; Tolba, Hesham ; O'Shaughnessy, Douglas
Author_Institution :
INRS-EMT, Univ. du Quebec, Montreal, QC
Abstract :
This paper presents a technique for improving the performance of multi-stream HMMs in ASR systems. In this technique, the stream exponents of the multi-stream model are chosen according to the phonological content of the underlying states. Two distinct feature sets, namely MFCCs and formant-like features, are used to investigate the potential of this technique. The experiments are performed on the AURORA database under the distributed speech recognition (DSR) framework. The proposed front-end constitutes an alternative to the DSR-XAFE (XAFE: eXtended Audio Front-End) provided by the European Telecommunications Standards Institute. It is shown that the proposed method yields a relative improvement of up to 10% in word accuracy over the multi-stream model with tied exponents, and a relative improvement of up to 35% in word accuracy over the state-of-the-art MFCC-based system.
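Illustrative sketch (not taken from the paper): in the usual multi-stream HMM formulation, the emission likelihood of a state is the product of the per-stream likelihoods, each raised to a stream exponent, so in the log domain it is an exponent-weighted sum. The Python sketch below contrasts tied exponents (one set shared by all states) with exponents picked per state. It assumes a single diagonal-covariance Gaussian per stream; the feature values, the exponent values, and the idea of a "vowel-like" state are all hypothetical and chosen only for illustration.

```python
import numpy as np

def diag_gaussian_logpdf(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian (one stream, one state)."""
    x, mean, var = (np.asarray(a, dtype=float) for a in (x, mean, var))
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var))

def multistream_log_likelihood(streams, means, variances, exponents):
    """Exponent-weighted sum of per-stream log-likelihoods for one HMM state.

    Equivalent to log( prod_s b_s(o_s) ** gamma_s ).
    """
    total = 0.0
    for x, m, v, gamma in zip(streams, means, variances, exponents):
        total += gamma * diag_gaussian_logpdf(x, m, v)
    return total

# Hypothetical toy observation: a 3-dim "MFCC" stream and a 2-dim "formant-like" stream.
streams = [[0.2, -0.1, 0.4], [0.5, 1.5]]
means = [[0.0, 0.0, 0.0], [0.6, 1.4]]
variances = [[1.0, 1.0, 1.0], [0.25, 0.25]]

# Tied exponents: every state uses the same weights.
tied = multistream_log_likelihood(streams, means, variances, [1.0, 1.0])

# State-dependent exponents (assumed values): a hypothetical vowel-like state
# leans more on the formant-like stream.
vowel_like = multistream_log_likelihood(streams, means, variances, [0.7, 1.3])

print(tied, vowel_like)
```

Setting an exponent to 0 removes that stream from the state's score, and setting all exponents to 1 recovers the plain product of stream likelihoods.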
Keywords :
cepstral analysis; hidden Markov models; speech recognition; ASR system; AURORA database; DSR framework; European Telecommunications Standards Institute; MFCC; automatic speech recognition; distributed speech recognition; formant-like feature; hidden Markov model; mel frequency cepstrum coefficient; multistream HMM framework; phonological content; Automatic speech recognition; Distributed databases; Frequency estimation; Hidden Markov models; Niobium; Resonant frequency; Spatial databases; Speech recognition; Telecommunication standards; Voting; Distributed Speech Recognition; formant frequencies; multi-stream paradigm;
Conference_Title :
Electrical and Computer Engineering, 2008. CCECE 2008. Canadian Conference on
Conference_Location :
Niagara Falls, ON
Print_ISBN :
978-1-4244-1642-4
Electronic_ISBN :
0840-7789
DOI :
10.1109/CCECE.2008.4564834