Speech description through MINERS: Model Invariant to Noise and Environment Robust for Speech

Author

Viana, H.O. ; Mello, C.A.B.

Author_Institution

Centro de Inf., Univ. Fed. de Pernambuco, Recife, Brazil

fYear

2014

fDate

5-8 Oct. 2014

Firstpage

489

Lastpage

494

Abstract

This paper presents a new speech descriptor called MINERS. It uses image analysis, Mel-Frequency Cepstrum Coefficients (MFCC), and a combination of wavelet denoising and Power Normalized Cepstral Coefficients (PNCC) temporal masking. The purpose of this descriptor is to make feature extraction for speech easier. The descriptor is invariant to noise, environment, and speaker. For evaluation, the NOIZEUS database was used for speech recognition with two classifiers: Support Vector Machine (SVM) and Hidden Markov Model (HMM). MINERS achieved better results than all other evaluated descriptors. The best results were obtained using MINERS with an SVM classifier.

Keywords

cepstral analysis; feature extraction; hidden Markov models; image classification; image denoising; speech recognition; support vector machines; HMM; MFCC; MINERS; NOIZEUS database; PNCC temporal masking; SVM classifier; hidden Markov model; image analysis; mel-frequency cepstrum coefficients; noise; power normalized cepstral coefficients; speech description; support vector machine; wavelet denoising; Mel frequency cepstral coefficient; Noise measurement; Noise reduction; Signal to noise ratio; Speech; PNCC; Speech Descriptors; Wavelets

fLanguage

English

Publisher

ieee

Conference_Titel

Systems, Man and Cybernetics (SMC), 2014 IEEE International Conference on

Conference_Location

San Diego, CA

Type

conf

DOI

10.1109/SMC.2014.6973955

Filename

6973955