Title :
Robust front-end processing for speaker identification over extremely degraded communication channels
Author :
Sadjadi, Seyed Omid ; Hansen, John H. L.
Author_Institution :
Center for Robust Speech Systems (CRSS), University of Texas at Dallas, Richardson, TX, USA
Abstract :
Effective front-end processing, which often involves feature extraction and speech activity detection (SAD), is essential for robustness in speech systems. In this study, we propose an unsupervised SAD scheme based on four different speech voicing measures, which are combined with a perceptual spectral flux feature. The effectiveness of the proposed scheme is evaluated and compared against several commonly adopted unsupervised SAD methods under actual harsh acoustic conditions. As an example application, we also evaluate the performance of the proposed SAD in the context of an i-vector based speaker identification (SID) system, where the recently introduced mean Hilbert envelope coefficients (MHEC) are benchmarked against conventional MFCCs. Long, spontaneous conversational audio recordings from the DARPA RATS program (Phase-I) are used in our evaluations. Experimental results indicate that the proposed SAD solution is highly effective and outperforms the other unsupervised SAD techniques considered. In addition, MHECs are shown to be effective alternatives to MFCCs for SID tasks under severely degraded channel conditions.
Keywords :
feature extraction; speaker recognition; speech processing; DARPA program RATS; MFCC; MHEC; SAD solution; SID system; SID tasks; channel conditions; communication channels; harsh acoustic conditions; i-vector based speaker identification system; mean Hilbert envelope coefficients; perceptual spectral flux feature; robust front end processing; speech activity detection; speech systems; speech voicing measures; spontaneous conversational audio recordings; unsupervised SAD methods; unsupervised SAD scheme; unsupervised SAD techniques; Mel frequency cepstral coefficient; Noise; Noise measurement; Rats; Robustness; Speech; Mean Hilbert Envelope Coefficients (MHEC); speaker identification (SID); spectral flux; speech activity detection (SAD); voicing measures;
Conference_Title :
2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639063