DocumentCode
323483
Title
A novel feature-extraction for speech recognition based on multiple acoustic-feature planes
Author
Nitta, Tsuneo
Author_Institution
Multimedia Eng. Lab., Toshiba Corp., Kawasaki, Japan
Volume
1
fYear
1998
fDate
12-15 May 1998
Firstpage
29
Abstract
This paper describes an attempt to incorporate functions of the auditory nerve system into the feature extractor of a speech recognizer. The functions comprise four types of well-known responses to sound stimuli: local peaks of the steady sound spectrum, ascending FM sound, descending FM sound, and sharply rising and falling sound. Each function is realized as a three-level derivative operator and applied to a time-spectrum (TS) pattern X(t,f) taken from the output of a 26-channel band-pass filter (BPF). The resulting acoustic cues of the input speech, represented as multiple acoustic-feature planes (MAFP), are compressed using the Karhunen-Loeve transform (KLT) and then classified. In experiments on a Japanese E-set (12 consonantal parts of /Ci/) extracted from continuous speech, the MAFP significantly reduced the error rate for unknown speakers to 17.0%, from 34.5% with X(t,f) and 29.6% with X(t,f)+ΔtX(t,f) (dimension=64).
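The plane-building step in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: "three-level" is read here as masks whose coefficients are limited to -1, 0, and +1, and the four 3x3 masks, the names `MASKS`, `apply_mask`, and `feature_planes`, and the toy input are all hypothetical, not taken from the paper. The KLT compression and the classifier are not shown.

```python
# Hypothetical 3x3 masks with coefficients in {-1, 0, +1} (an assumed reading
# of "three-level derivative operator"; the paper's actual operators are not
# given in the abstract).
# Rows are time offsets (t-1, t, t+1); columns are channel offsets (f-1, f, f+1).
MASKS = {
    "local_peak":    [[0, 1, 0], [-1, 0, -1], [0, 1, 0]],
    "ascending_fm":  [[1, 0, -1], [0, 0, 0], [-1, 0, 1]],
    "descending_fm": [[-1, 0, 1], [0, 0, 0], [1, 0, -1]],
    "rise_fall":     [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
}


def apply_mask(pattern, mask):
    """Correlate a time-spectrum pattern (list of frames) with one 3x3 mask.

    Border cells, where the mask would not fit, are left at 0.0.
    """
    n_t, n_f = len(pattern), len(pattern[0])
    plane = [[0.0] * n_f for _ in range(n_t)]
    for t in range(1, n_t - 1):
        for f in range(1, n_f - 1):
            plane[t][f] = float(sum(
                mask[dt + 1][df + 1] * pattern[t + dt][f + df]
                for dt in (-1, 0, 1)
                for df in (-1, 0, 1)
            ))
    return plane


def feature_planes(pattern):
    """Return one feature plane per mask, keyed by mask name."""
    return {name: apply_mask(pattern, mask) for name, mask in MASKS.items()}


# Toy input: 5 frames x 26 channels with an upward-moving ridge
# (energy at channel t + 10 in frame t). This is made-up data.
N_FRAMES, N_CHANNELS = 5, 26
ts = [[1.0 if f == t + 10 else 0.0 for f in range(N_CHANNELS)]
      for t in range(N_FRAMES)]
planes = feature_planes(ts)
```

With this toy ridge, the ascending-FM plane responds positively along the ridge while the descending-FM plane responds negatively there. In a fuller pipeline the stacked planes would presumably be flattened into one vector before the KLT step.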
Keywords
acoustic signal processing; feature extraction; frequency modulation; hearing; pattern classification; spectral analysis; speech processing; speech recognition; transforms; Japanese E-set; KLT; acoustic cue; ascending FM sound; auditory nerve system; continuous speech; descending FM sound; error rate; experiments; feature-extraction; input speech; local peaks; multiple acoustic-feature planes; sharply falling sound; sharply rising sound; sound stimuli; speech recognition; steady sound; three-level derivative operator; time-spectrum pattern; Acoustical engineering; Band pass filters; Electronic mail; Error analysis; Feature extraction; Karhunen-Loeve transforms; Laboratories; Multimedia systems; Radio frequency; Speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location
Seattle, WA
ISSN
1520-6149
Print_ISBN
0-7803-4428-6
Type
conf
DOI
10.1109/ICASSP.1998.674359
Filename
674359