DocumentCode
2854568
Title
Auditory-based acoustic distinctive features and spectral cues for automatic speech recognition using a multi-stream paradigm
Author
Tolba, Hesham ; Selouani, Sid-Ahmed ; O'Shaughnessy, Douglas
Author_Institution
INRS-Télécommunications, Université du Québec, 900 de la Gauchetière Ouest, H5A 1C6, Canada
Volume
1
fYear
2002
fDate
13-17 May 2002
Abstract
In this paper, a multi-stream paradigm is proposed to improve the performance of automatic speech recognition (ASR) systems. Our goal is to improve the performance of HMM-based ASR systems by exploiting features that characterize speech sounds, some based on the auditory system and one based on the Fourier power spectrum. It was found that combining the classical MFCCs with auditory-based acoustic distinctive cues and the main peaks of the speech signal's spectrum in a multi-stream paradigm improves recognition performance. The Hidden Markov Model Toolkit (HTK) was used throughout our experiments to test the new multi-stream feature vector. A series of speaker-independent continuous-speech recognition experiments was carried out on a subset of the large read-speech corpus TIMIT. Using this multi-stream paradigm, N-mixture mono-/tri-phone models and a bigram language model, we found that the word error rate decreased by about 4.01%.
Keywords
Analytical models; Cepstrum; Computational modeling; Ear; Hidden Markov models; Markov processes; Speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location
Orlando, FL, USA
ISSN
1520-6149
Print_ISBN
0-7803-7402-9
Type
conf
DOI
10.1109/ICASSP.2002.5743869
Filename
5743869