Auditory-based robust speech recognition system for ambient assisted living in smart home

Author

Hsien-Shun Kuo ; Po-Hsun Sung ; Sheng-Chieh Lee ; Ta-Wen Kuan ; Jhing-Fa Wang

Author_Institution

Dept. of Electr. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan

fYear

2014

fDate

20-23 Sept. 2014

Firstpage

169

Lastpage

172

Abstract

An auditory-based feature extraction algorithm is proposed for enhancing the robustness of automatic speech recognition. In the proposed approach, the speech signal is characterized using a new feature referred to as the Basilar-membrane Frequency-band Cepstral Coefficient (BFCC). In contrast to the conventional Mel-Frequency Cepstral Coefficient (MFCC) method, which is based on a Fourier spectrogram, the proposed BFCC method uses an auditory spectrogram based on a gammachirp wavelet transform in order to mimic the auditory response of the human ear more accurately and to improve noise immunity. In addition, a Hidden Markov Model (HMM) is used for both training and testing. Evaluation results obtained using the AURORA 2 noisy speech database show that, compared to the MFCC method, the proposed scheme improves the speech recognition rate by 15% on average for speech samples with Signal-to-Noise Ratios (SNRs) ranging from 0 to 20 dB. Thus, the proposed method has significant potential for the development of robust speech recognition systems for ambient assisted living.
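The pipeline the abstract outlines (gammachirp filterbank, auditory spectrogram, cepstral coefficients) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameter values, so the filter order `n`, bandwidth factor `b`, chirp factor `c`, the logarithmic spacing of centre frequencies, the frame sizes, and the function names `gammachirp` and `auditory_cepstrum` are all assumptions chosen for illustration. The filters are applied here by plain FIR convolution rather than by a wavelet transform.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 + 0.108 * fc

def gammachirp(fc, fs, n=4, b=1.019, c=-2.0, duration=0.025):
    """Impulse response of one gammachirp filter centred at fc (Hz).

    n, b, c and duration are illustrative assumptions, not values from the paper.
    """
    n_samples = int(round(duration * fs))
    t = np.arange(1, n_samples + 1) / fs  # start at 1/fs so that log(t) is finite
    envelope = t ** (n - 1) * np.exp(-2 * np.pi * b * erb(fc) * t)
    carrier = np.cos(2 * np.pi * fc * t + c * np.log(t))
    g = envelope * carrier
    return g / np.sqrt(np.sum(g ** 2))  # normalise to unit energy

def dct2(x, n_out):
    """Unnormalised DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def auditory_cepstrum(signal, fs, n_bands=24, n_ceps=13,
                      frame_len=0.025, hop=0.010, fmin=100.0, fmax=3800.0):
    """Cepstral features from a gammachirp-based auditory spectrogram (sketch)."""
    # Assumption: log-spaced centre frequencies stand in for an auditory scale.
    centres = np.geomspace(fmin, fmax, n_bands)
    bands = np.stack([np.convolve(signal, gammachirp(fc, fs), mode="same")
                      for fc in centres])
    flen, fhop = int(round(frame_len * fs)), int(round(hop * fs))
    n_frames = 1 + (bands.shape[1] - flen) // fhop
    # Per-frame band energies form the auditory spectrogram, shape (frames, bands).
    energies = np.stack([
        np.sum(bands[:, i * fhop:i * fhop + flen] ** 2, axis=1)
        for i in range(n_frames)
    ])
    log_spec = np.log(energies + 1e-10)  # small floor avoids log(0)
    return dct2(log_spec, n_ceps)
```

Calling `auditory_cepstrum(x, 8000)` on a one-dimensional NumPy array `x` returns one row of 13 coefficients per 10 ms frame. In a recognizer such as the one described above, rows like these would serve as the HMM observation vectors in place of MFCCs.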

Keywords

assisted living; hidden Markov models; speech recognition; wavelet transforms; AURORA 2 noisy speech database; BFCC; Basilar-membrane frequency-band cepstral coefficient; HMM; MFCC; Mel-frequency cepstral coefficient method; SNR; ambient assisted living; auditory human ear response; auditory spectrogram; auditory-based feature extraction algorithm; auditory-based robust speech recognition system; automatic speech recognition; gammachirp wavelet transform; hidden Markov model; robust speech recognition systems; signal-to-noise ratios; smart home; Mel frequency cepstral coefficient; Noise; Robustness; Speech; Speech recognition; Wavelet transforms; Ambient assisted living; auditory modeling; cepstral coefficients; gammachirp filterbank; speech recognition

fLanguage

English

Publisher

IEEE

Conference_Title

Orange Technologies (ICOT), 2014 IEEE International Conference on

Conference_Location

Xian

Type

conf

DOI

10.1109/ICOT.2014.6956626

Filename

6956626