• DocumentCode
    153614
  • Title
    Auditory-based robust speech recognition system for ambient assisted living in smart home
  • Author
    Hsien-Shun Kuo ; Po-Hsun Sung ; Sheng-Chieh Lee ; Ta-Wen Kuan ; Jhing-Fa Wang
  • Author_Institution
    Dept. of Electr. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
  • fYear
    2014
  • fDate
    20-23 Sept. 2014
  • Firstpage
    169
  • Lastpage
    172
  • Abstract
    An auditory-based feature extraction algorithm is proposed for enhancing the robustness of automatic speech recognition. In the proposed approach, the speech signal is characterized using a new feature referred to as the Basilar-membrane Frequency-band Cepstral Coefficient (BFCC). In contrast to the conventional Mel-Frequency Cepstral Coefficient (MFCC) method, which is based on a Fourier spectrogram, the proposed BFCC method uses an auditory spectrogram based on a gammachirp wavelet transform in order to mimic the auditory response of the human ear more accurately and to improve noise immunity. In addition, a Hidden Markov Model (HMM) is used for both training and testing. Evaluation results obtained using the AURORA 2 noisy speech database show that, compared to the MFCC method, the proposed scheme improves the speech recognition rate by 15% on average for speech samples with Signal-to-Noise Ratios (SNRs) ranging from 0 to 20 dB. Thus, the proposed method has significant potential for the development of robust speech recognition systems for ambient assisted living.
  • Keywords
    assisted living; hidden Markov models; speech recognition; wavelet transforms; AURORA 2 noisy speech database; BFCC; Basilar-membrane frequency-band cepstral coefficient; HMM; MFCC; Mel-frequency cepstral coefficient method; SNR; ambient assisted living; auditory human ear response; auditory spectrogram; auditory-based feature extraction algorithm; auditory-based robust speech recognition system; automatic speech recognition; gammachirp wavelet transform; hidden Markov model; robust speech recognition systems; signal-to-noise ratios; smart home; Mel frequency cepstral coefficient; Noise; Robustness; Speech; Speech recognition; Wavelet transforms; Ambient assisted living; auditory modeling; cepstral coefficients; gammachirp filterbank; speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Orange Technologies (ICOT), 2014 IEEE International Conference on
  • Conference_Location
    Xian
  • Type
    conf
  • DOI
    10.1109/ICOT.2014.6956626
  • Filename
    6956626
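
The abstract describes replacing the Fourier spectrogram of MFCC with a gammachirp-based auditory spectrogram before taking cepstral coefficients. The following is a minimal sketch of that general idea, not the authors' implementation: it filters one frame with a small bank of gammachirp kernels instead of computing a full wavelet transform, and every function name and parameter value (`n=4`, `b=1.019`, `c=-2.0`, the 25 ms kernel length, the centre frequencies) is an illustrative assumption that does not come from the paper.

```python
import math

def erb(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 + 0.108 * f_hz

def gammachirp(fc, fs, duration=0.025, n=4, b=1.019, c=-2.0):
    """Sampled gammachirp impulse response, peak-normalised.

    g(t) = t^(n-1) * exp(-2*pi*b*ERB(fc)*t) * cos(2*pi*fc*t + c*ln(t))
    Setting c = 0 reduces this to a gammatone. Parameter values are assumed.
    """
    num = int(round(duration * fs))
    out = []
    for k in range(1, num + 1):          # start at t > 0 so ln(t) is defined
        t = k / fs
        envelope = t ** (n - 1) * math.exp(-2.0 * math.pi * b * erb(fc) * t)
        out.append(envelope * math.cos(2.0 * math.pi * fc * t + c * math.log(t)))
    peak = max(abs(v) for v in out)
    return [v / peak for v in out]

def band_energy(frame, kernel):
    """Energy of the frame after direct (full) convolution with the kernel."""
    total = 0.0
    for i in range(len(frame) + len(kernel) - 1):
        lo = max(0, i - len(kernel) + 1)
        hi = min(i, len(frame) - 1)
        acc = 0.0
        for j in range(lo, hi + 1):
            acc += frame[j] * kernel[i - j]
        total += acc * acc
    return total

def dct2(values, n_out):
    """Unnormalised DCT-II, returning the first n_out coefficients."""
    size = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * size))
            for i, v in enumerate(values))
        for k in range(n_out)
    ]

def cepstral_frame(frame, fs, centers, n_ceps=13):
    """Log band energies from the gammachirp bank, followed by a DCT."""
    log_energies = [
        math.log(band_energy(frame, gammachirp(fc, fs)) + 1e-12)  # floor avoids log(0)
        for fc in centers
    ]
    return dct2(log_energies, n_ceps)

# Usage: one 25 ms frame of a 1 kHz tone at an assumed 8 kHz sampling rate.
fs = 8000
frame = [math.sin(2.0 * math.pi * 1000.0 * k / fs) for k in range(200)]
centers = [200.0 + 225.0 * i for i in range(16)]   # hypothetical band layout
features = cepstral_frame(frame, fs, centers)
```

In a complete recogniser, one such feature vector per frame would be passed to the HMM for training and decoding. The sketch stops at feature extraction and makes no claim about the recognition rates reported in the abstract.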