DocumentCode
3019221
Title
An efficient speaker-independent automatic speech recognition by simulation of some properties of human auditory perception
Author
Hermansky, Hynek
Author_Institution
Speech Technology Laboratory, Santa Barbara, California
Volume
12
fYear
1987
fDate
31868
Firstpage
1159
Lastpage
1162
Abstract
An auditory model of speech perception, the Perceptually based linear predictive analysis with Root power sum metric (PLP-RPS), is applied as the front-end of an automatic speech recognizer (ASR). The PLP-RPS front-end is compared with the standard linear predictive-cepstral metric (LP-CEP) front-end, and with LP-RPS and PLP-CEP front-ends. The two-spectral-peak models are the most efficient in modeling the linguistic information in speech. Consequently, in speaker-independent ASR, high-analysis-order front-ends are less effective than low-order front-ends. Synthetic speech is used for front-end evaluation. Some of the perceptual inconsistencies of standard LP front-ends are alleviated in PLP front-ends. The PLP-RPS front-end is most sensitive to the harmonic structure of the speech spectrum. Perceptual experiments indicate similar tendencies in human auditory perception.
Keywords
Auditory system; Automatic speech recognition; Humans; Laboratories; Natural languages; Power harmonic filters; Predictive models; Speech analysis; Testing; Vocabulary;
fLanguage
English
Publisher
ieee
Conference_Titel
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '87.
Type
conf
DOI
10.1109/ICASSP.1987.1169803
Filename
1169803