DocumentCode :
296522
Title :
Consonant characterization using correlation fractal dimension for speech recognition
Author :
Langi, A. ; Kinsner, W.
Author_Institution :
Dept. of Electr. & Comput. Eng., Manitoba Univ., Winnipeg, Man., Canada
Volume :
1
fYear :
1995
fDate :
15-16 May 1995
Firstpage :
208
Abstract :
This paper presents a new and promising approach to characterizing speech consonants with a fractal model, for use in speech recognition systems. Characterizing consonants has been difficult because consonant waveforms may be indistinguishable in the time or frequency domain. The approach views consonant waveforms as arising from a turbulent constriction (excitation) in the human speech production system, and thus as having a turbulent, noise-like appearance in the time domain. It departs from the usual approach, however, by modeling the consonant excitation with chaotic dynamical systems capable of generating turbulent, noise-like excitations. The scheme uses the correlation fractal dimension and the Takens embedding theorem to measure the fractal dimension from time-series observations of the dynamical systems. The time series are the linear predictive coding (LPC) excitations of twenty-two consonant waveforms, and the correlation fractal dimension is calculated using a fast Grassberger algorithm. Preliminary observations are encouraging: every consonant yields a unique trend of fractal dimensions across embedding dimensions and scales.
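The measurement described in the abstract (delay embedding of a time series, then a correlation-sum estimate of fractal dimension at several embedding dimensions) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a brute-force O(N^2) pair count rather than the fast Grassberger algorithm the paper refers to, it does not compute an LPC excitation, and the function names, the lag, the radii, and the logistic-map stand-in series are all assumptions made for illustration.

```python
import math


def delay_embed(series, dim, lag=1):
    """Delay embedding in the sense of Takens: scalar series -> dim-dimensional points."""
    n = len(series) - (dim - 1) * lag
    return [tuple(series[i + k * lag] for k in range(dim)) for i in range(n)]


def correlation_sum(points, radius):
    """Fraction of distinct point pairs closer than radius (maximum norm)."""
    n = len(points)
    close = 0
    for i in range(n - 1):
        p = points[i]
        for j in range(i + 1, n):
            q = points[j]
            if max(abs(a - b) for a, b in zip(p, q)) < radius:
                close += 1
    return 2.0 * close / (n * (n - 1))


def correlation_dimension(series, dim, radii, lag=1):
    """Least-squares slope of log C(r) against log r over the given radii."""
    points = delay_embed(series, dim, lag)
    xs, ys = [], []
    for r in radii:
        c = correlation_sum(points, r)
        if c > 0:  # log is undefined when no pair falls inside the radius
            xs.append(math.log(r))
            ys.append(math.log(c))
    if len(xs) < 2:
        raise ValueError("need at least two radii with a nonzero correlation sum")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Stand-in chaotic series (logistic map); the paper uses LPC excitations instead.
    x, series = 0.3, []
    for _ in range(300):
        x = 4.0 * x * (1.0 - x)
        series.append(x)
    radii = [0.02, 0.04, 0.08, 0.16]  # assumed scales, chosen only for the demo
    for dim in (1, 2, 3):
        print(dim, correlation_dimension(series, dim, radii))
```

Repeating the estimate over several embedding dimensions and ranges of radii gives the kind of dimension-versus-embedding trend the abstract describes, although the values from this sketch depend heavily on the assumed series length and scales.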
Keywords :
chaos; correlation theory; fractals; linear predictive coding; noise; speech coding; speech recognition; time series; time-domain analysis; Takens embedding theorem; chaotic dynamical systems; consonant characterization; consonant excitation; consonant waveforms; correlation fractal dimension; fast Grassberger algorithm; fractal model; frequency domain; human speech production system; time domain; time-series observation; turbulent constriction; Data compression; Frequency domain analysis; Humans; Noise generators; Production systems
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
WESCANEX 95. Communications, Power, and Computing. Conference Proceedings., IEEE
Conference_Location :
Winnipeg, Man.
Print_ISBN :
0-7803-2725-X
Type :
conf
DOI :
10.1109/WESCAN.1995.493972
Filename :
493972