Title :
Noise robust speech activity detection
Author :
Abdulla, Waleed H. ; Guan, Zhou ; Sou, Hou Chi
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Auckland, Auckland, New Zealand
Abstract :
An efficient noise-robust feature is presented for tracking speech activity in noisy environments. Speech is modeled by one class of 16 phone-like Gaussian mixtures, while noise is modeled by 15 classes of 6 mixtures each. The feature vector is a concatenation of carefully selected coefficients from MFCC, LPCC, and their first and second derivatives. A finite state machine and an energy validation component are proposed as a post-processor for the GMM classifier to rectify misclassified speech segments. The demonstrated speech activity detection system based on this feature reliably detects both speech and non-speech segments. The designed framework has been benchmarked against the commercially available codecs G.729, GSM-EFR, MR1, and MR2. Results show that the proposed technique outperforms all of these commonly used techniques at various SNR levels and in different noisy environments.
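The classification and post-processing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the decision rule or the state machine, so the maximum-likelihood comparison, the two-state machine, and the names `frame_decisions`, `smooth`, `min_speech`, and `hangover` are all assumptions made for illustration. The per-frame log-likelihoods are assumed to come from already trained GMMs (one speech model, several noise models), and the energy validation component is not modeled.

```python
from typing import List, Sequence


def frame_decisions(speech_ll: Sequence[float],
                    noise_ll: Sequence[Sequence[float]]) -> List[bool]:
    """Label each frame as speech when the speech-model log-likelihood
    exceeds the best log-likelihood among the noise models.

    speech_ll[t] is the (assumed precomputed) speech GMM score of frame t;
    noise_ll[t] holds the scores of every noise GMM for the same frame.
    """
    return [s > max(n) for s, n in zip(speech_ll, noise_ll)]


def smooth(decisions: Sequence[bool],
           min_speech: int = 3,
           hangover: int = 2) -> List[bool]:
    """Hypothetical two-state machine that smooths raw frame decisions.

    - NON-SPEECH -> SPEECH after `min_speech` consecutive speech frames.
    - SPEECH -> NON-SPEECH after more than `hangover` consecutive
      non-speech frames.
    The onset is not back-filled, so detection lags by min_speech - 1 frames.
    """
    out: List[bool] = []
    in_speech = False
    run = 0  # consecutive frames that contradict the current state
    for is_speech in decisions:
        if in_speech:
            if is_speech:
                run = 0
            else:
                run += 1
                if run > hangover:
                    in_speech = False
                    run = 0
        else:
            if is_speech:
                run += 1
                if run >= min_speech:
                    in_speech = True
                    run = 0
            else:
                run = 0
        out.append(in_speech)
    return out
```

With these assumed defaults, an isolated speech-labeled frame is ignored and a one- or two-frame dropout inside a speech segment is bridged, which is the kind of misclassification a post-processor of this type is meant to rectify.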
Keywords :
Gaussian processes; cepstral analysis; finite state machines; speech recognition; G.729 codec; GSM-EFR; Gaussian mixtures; LPCC; MFCC; feature vector; finite state machine; linear prediction cepstral coefficients; mel-frequency cepstral coefficients; noise robust feature; robust speech activity detection; Computer vision; Error analysis; Feature extraction; Mel frequency cepstral coefficient; Noise robustness; Signal processing; Spatial databases; Speech enhancement; Testing; Working environment noise; speech detection in noise; voice activity detection
Conference_Title :
Signal Processing and Information Technology (ISSPIT), 2009 IEEE International Symposium on
Conference_Location :
Ajman
Print_ISBN :
978-1-4244-5949-0
DOI :
10.1109/ISSPIT.2009.5407509