Regional Information Center for Science and Technology - Speech Analysis in a Model of the Central Auditory System

DocumentCode :

1060137

Title :

Speech Analysis in a Model of the Central Auditory System

Author :

Woojay Jeon ; Juang, B.-H.

Author_Institution :

Motorola Lab., Schaumburg

Volume :

Issue :

fYear :

2007

Firstpage :

1802

Lastpage :

1817

Abstract :

Recently, there has been a significant increase in research interest in biologically inspired systems, which, in the context of speech communications, attempt to learn from human auditory perception and cognition capabilities so as to derive knowledge and benefits currently unavailable in practice. One particular pursuit is to understand why the human auditory system generally performs with much more robustness than an engineering system, say a state-of-the-art automatic speech recognizer. In this study, we adopt a computational model of the mammalian central auditory system and develop a methodology to analyze and interpret its behavior for an enhanced understanding of its end product, which is a data-redundant, dimension-expanded representation of neural firing rates in the primary auditory cortex (A1). Our first approach is to reinterpret the well-known Mel-frequency cepstral coefficients (MFCCs) in the context of the auditory model. We then present a framework for interpreting the cortical response as a place-coding of speech information, and identify some key advantages of the model's dimension expansion. The framework consists of a model of "source"-invariance that predicts how speech information is encoded in a class-dependent manner, and a model of "environment"-invariance that predicts the noise-robustness of class-dependent signal-respondent neurons. The validity of these ideas is experimentally assessed within an existing recognition framework by selecting features that demonstrate their effects and applying them in a conventional phoneme classification task. The results are quantitatively and qualitatively discussed, and our insights inspire future research on category-dependent features and speech classification using the auditory model.

Keywords :

cepstral analysis; hearing; signal classification; speech coding; speech recognition; Mel-frequency cepstral coefficients; auditory cognition; auditory perception; automatic speech recognizer; central auditory system; neural firing rates; primary auditory cortex; speech analysis; speech classification; Auditory system; Biological information theory; Biological system modeling; Brain modeling; Cepstral analysis; Cognition; Context; Oral communication; Predictive models; Speech analysis; Auditory model; central auditory system; cortex; dimension expansion; noise robust; speech;

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2007.900102

Filename :

4276755

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1060137