DocumentCode :
3407043
Title :
Regularization in a reproducing kernel Hilbert space for robust voice activity detection
Author :
Lu, Xugang ; Unoki, Masashi ; Isotani, Ryosuke ; Kawai, Hisashi ; Nakamura, Satoshi
Author_Institution :
Nat. Inst. of Inf. & Commun. Technol., Japan
fYear :
2010
fDate :
24-28 Oct. 2010
Firstpage :
585
Lastpage :
588
Abstract :
Voice activity detection (VAD) remains difficult in noisy environments because the statistical distributions of speech and non-speech features overlap heavily. Considering that speech is a special type of acoustic signal that occupies only a small fraction of the whole acoustic space, we propose a new speech processing method for VAD that constrains the processing space to a reproducing kernel Hilbert space (RKHS). In the RKHS, the estimation of speech is treated as a functional approximation problem. Via regularization in the RKHS framework, a target function is learned to approximate the speech signal while the noise component is expected to be smoothed out. In this framework, nonlinear mapping functions are incorporated into the approximation implicitly via a kernel function, so the approximation function can capture the nonlinear and high-order statistical structure of speech. Our VAD algorithm is based on the power in this regularized RKHS. We tested it on the CENSREC-1-C data corpus for the VAD task, quantified its discriminability between speech and non-speech, and compared it with several classical VAD algorithms. Experimental results showed that the proposed processing enhanced the discriminability between the distributions of speech and non-speech and outperformed the classical VAD algorithms on the VAD task.
Keywords :
Hilbert spaces; higher order statistics; speech enhancement; statistical distributions; CENSREC-1-C data corpus; RKHS; VAD algorithm; acoustic signal; functional approximation problem; high-order statistical structure; kernel function; noise component; nonlinear mapping functions; nonspeech features; reproducing kernel Hilbert space; robust voice activity detection; speech features; speech processing method; Approximation methods; Kernel; Noise measurement; Signal to noise ratio; Speech; Speech processing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing (ICSP), 2010 IEEE 10th International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-5897-4
Type :
conf
DOI :
10.1109/ICOSP.2010.5656060
Filename :
5656060