Regional Information Center for Science and Technology (RICEST) - Robust Acoustic Speech Feature Prediction From Noisy Mel-Frequency Cepstral Coefficients

DocumentCode :

1466858

Title :

Robust Acoustic Speech Feature Prediction From Noisy Mel-Frequency Cepstral Coefficients

Author :

Milner, B. ; Darch, Jonathan

Author_Institution :

Sch. of Comput. Sci., Univ. of East Anglia, Norwich, UK

Volume :

Issue :

Year :

2011

Firstpage :

338

Lastpage :

347

Abstract :

This paper examines the effect of applying noise compensation to acoustic speech feature prediction from noisy mel-frequency cepstral coefficient (MFCC) vectors within a distributed speech recognition architecture. An acoustic speech feature (comprising fundamental frequency, formant frequencies, speech/nonspeech classification, and voicing classification) is predicted from an MFCC vector in a maximum a posteriori (MAP) framework using phoneme-specific or global models of speech. The effect of noise is considered, and three different noise compensation methods that have been successful in robust speech recognition are integrated within the MAP framework. Experiments show that noise compensation can be applied successfully to prediction, with the best performance given by a model adaptation method that performs only slightly worse than matched training and testing. Further experiments consider application of the predicted acoustic features to speech reconstruction. A series of human listening tests shows that the predicted features are sufficient for speech reconstruction and that noise compensation improves speech quality in noisy conditions.
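
As an illustration of the kind of MAP prediction the abstract describes, the following minimal sketch assumes that an acoustic feature x (for example, fundamental frequency) and an MFCC vector y are modelled by a single joint Gaussian. Under that assumption the MAP estimate of x given y coincides with the conditional mean, mu_x + cov_xy * inv(cov_yy) * (y - mu_y). The function name `map_predict`, the single-Gaussian model, and all numeric values are hypothetical and chosen only for illustration; the record does not state the model form the authors used, and no noise compensation is shown.

```python
import numpy as np

def map_predict(y, mu_x, mu_y, cov_xy, cov_yy):
    """MAP estimate of x given y under an assumed joint Gaussian model.

    For a joint Gaussian the posterior p(x | y) is Gaussian, so its mode
    (the MAP estimate) equals the conditional mean.
    """
    # Solve cov_yy @ w = (y - mu_y) rather than forming an explicit inverse.
    w = np.linalg.solve(cov_yy, y - mu_y)
    return mu_x + cov_xy @ w

# Toy, made-up parameters: a 1-D acoustic feature and a 2-D "MFCC" vector.
mu_x = np.array([120.0])                      # mean of the acoustic feature
mu_y = np.array([0.0, 0.0])                   # mean of the MFCC vector
cov_xy = np.array([[10.0, 0.0]])              # cross-covariance between x and y
cov_yy = np.array([[2.0, 0.0], [0.0, 1.0]])   # covariance of the MFCC vector

x_hat = map_predict(np.array([1.0, 3.0]), mu_x, mu_y, cov_xy, cov_yy)
```

In this toy setting only the first MFCC dimension is correlated with the feature, so the second dimension of y has no effect on the prediction. A phoneme-specific variant would, under the same assumption, keep one such parameter set per phoneme.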

Keywords :

acoustic signal processing; cepstral analysis; maximum likelihood estimation; signal classification; signal reconstruction; speech recognition; MAP framework; MFCC vector; acoustic features; acoustic speech feature prediction; distributed speech recognition architecture; formant frequency; fundamental frequency; maximum a posteriori; noise compensation; noisy mel-frequency cepstral coefficient; nonspeech classification; phoneme-specific speech; speech quality; speech reconstruction; voicing classification; Acoustic noise; Acoustic testing; Adaptation model; Cepstral analysis; Mel frequency cepstral coefficient; Noise robustness; Performance evaluation; Predictive models; Speech enhancement; Speech recognition; Fundamental frequency; distributed speech recognition; formants; maximum a posteriori (MAP); noise compensation;

Language :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2010.2047811

Filename :

5445018

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1466858