Title :
Using a Model of the Cochlea Based in the Micro and Macro Mechanical to Find Parameters for Automatic Speech Recognition
Author :
Oropeza Rodriguez, Jose Luis ; Reyes Saldana, Jose Francisco
Author_Institution :
Dept. of Digital Signal Process., Nat. Polytech. Inst., Mexico City, Mexico
Abstract :
Parametric representations based on cochlear behavior have recently been used in several studies on Automatic Speech Recognition (ASR), because the cochlea, a key organ of mammalian hearing, is the principal element that transduces the sound pressure received by the ear. In this paper we show how the macro- and micromechanical model of the cochlea can be used in ASR tasks. MFCC parameters use the Mel scale to build a filter bank in which the center frequency of each filter is a function of that scale. We propose a new approach that constructs the filter bank differently: we use the values that Neely found in his work on the macro- and micromechanical model to set the center frequencies of the filter bank, substituting the response of Neely's model for the Mel scale function, and then extract parameters from the speech signal in a manner similar to the way MFCC are constructed. This distribution of the filter bank gives a new representation of speech in the frequency domain. Using the place theory, we reach 98.5% performance on a task with isolated digits pronounced by 5 different speakers. Neely's model was used because, when a set of cochlear parameters such as mass, damping, and stiffness, among others, is substituted into the model, the response obtained is close to the one von Békésy proposed in his preliminary work on the operating principle of the cochlea.
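The idea of replacing the Mel scale with a place-based frequency map can be illustrated with a minimal sketch. The abstract does not give the center frequencies derived from Neely's model, so this sketch assumes Greenwood's place-to-frequency function for the human cochlea as a stand-in. The function names, the triangular filter shape, the 8 kHz sample rate, the 100 to 4000 Hz range, and the filter and coefficient counts are all hypothetical choices for illustration, not the authors' settings.

```python
import numpy as np

def greenwood_hz(x):
    """Greenwood place-to-frequency map (human); x is the fractional
    distance from apex (0) to base (1). Stand-in for Neely's model."""
    return 165.4 * (10.0 ** (2.1 * x) - 0.88)

def place_spaced_frequencies(n_filters, f_min, f_max):
    """Return n_filters + 2 edge/center frequencies equally spaced in
    cochlear place between f_min and f_max (Hz)."""
    # Invert the Greenwood map to find the places of the band limits.
    x_min = np.log10(f_min / 165.4 + 0.88) / 2.1
    x_max = np.log10(f_max / 165.4 + 0.88) / 2.1
    return greenwood_hz(np.linspace(x_min, x_max, n_filters + 2))

def triangular_filter_bank(points_hz, n_fft, sample_rate):
    """Build triangular filters; filter i rises from points_hz[i] to its
    center points_hz[i + 1] and falls to points_hz[i + 2]."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    n_filters = len(points_hz) - 2
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = points_hz[i:i + 3]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def cepstral_coefficients(frame, bank, n_coeffs):
    """MFCC-style pipeline: power spectrum, filter-bank log energies, DCT-II."""
    n_fft = 2 * (bank.shape[1] - 1)
    windowed = frame * np.hamming(frame.size)
    power = np.abs(np.fft.rfft(windowed, n=n_fft)) ** 2
    log_energy = np.log(bank @ power + 1e-10)  # small offset avoids log(0)
    n = log_energy.size
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    dct_basis = np.cos(np.pi * k * (m + 0.5) / n)
    return dct_basis @ log_energy

# Usage with a synthetic 50 ms frame (hypothetical settings).
sample_rate, n_fft = 8000, 512
points = place_spaced_frequencies(20, 100.0, 4000.0)
bank = triangular_filter_bank(points, n_fft, sample_rate)
frame = np.random.default_rng(0).standard_normal(400)
coeffs = cepstral_coefficients(frame, bank, n_coeffs=12)
```

Only `place_spaced_frequencies` differs from a conventional MFCC front end: swapping in center frequencies obtained from another cochlear model would leave the rest of the pipeline unchanged.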
Keywords :
channel bank filters; damping; ear; elastic constants; signal representation; speech recognition; ASR tasks; MFCC parameters; Mel scale function; Neely model; automatic speech recognition; bank filter; central frequencies; cochlea behavior model; damping; frequency domain; hearing organ; macromechanical model; mammalians; mass; micromechanical model; parametric representation; sound pressure transduction; speech representation; stiffness; Filter banks; Hair; Hidden Markov models; Mathematical model; Mel frequency cepstral coefficient; Physiology; Speech; Speech recognition; cochlea operation; place theory and bank filter component;
Conference_Title :
Artificial Intelligence (MICAI), 2013 12th Mexican International Conference on
Conference_Location :
Mexico City
Print_ISBN :
978-1-4799-2604-6
DOI :
10.1109/MICAI.2013.39