Title :
Voice activity detection based on density ratio estimation and system combination
Author :
Tachioka, Yuuki ; Hanazawa, Toshiyuki ; Narita, T. ; Ishii, Jun
Author_Institution :
Inf. Technol. R & D Center, Mitsubishi Electr. Corp., Kanagawa, Japan
Date :
Oct. 29, 2013 - Nov. 1, 2013
Abstract :
We propose a robust voice activity detection (VAD) method based on density ratio estimation. In highly noisy environments, the likelihood ratio test (LRT) is effective. Conventional LRT estimates both a speech model and a noise model, calculates the likelihood under each model, and uses the ratio of these likelihoods to detect speech. However, LRT needs only the ratio of the speech and noise likelihoods; the likelihoods of the individual models are not necessarily required. Density ratio estimation models the likelihood ratio function with kernels and estimates the likelihood ratio directly. Applying density ratio estimation to VAD requires that feature selection and noise adaptation be considered, because density ratio estimation constrains the shape of the likelihood ratio function and speech is dynamic. This paper addresses both problems. To improve accuracy, the proposed method is combined with conventional LRT. Experimental results on CENSREC-1-C show that the proposed method is more effective than conventional methods, especially in non-stationary noisy environments.
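The idea of estimating the likelihood ratio directly, without fitting separate speech and noise models, can be illustrated with a minimal sketch. The abstract does not say which density ratio estimator, kernel, features, or threshold the authors use, so everything below is an assumption: the sketch swaps in unconstrained least-squares importance fitting (uLSIF) with Gaussian kernels, uses synthetic 2-D "frames" instead of real acoustic features, and the names fit_ulsif and density_ratio as well as the values of sigma, lam, and the decision threshold are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, centers, sigma):
    # x: (n, d), centers: (b, d) -> kernel matrix of shape (n, b)
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fit_ulsif(speech, noise, sigma=1.0, lam=0.1, n_centers=50, seed=0):
    # Model r(x) = p_speech(x) / p_noise(x) as sum_l alpha_l * K(x, c_l).
    # Assumption: kernel centers are drawn from the speech (numerator) samples.
    rng = np.random.default_rng(seed)
    n_pick = min(n_centers, len(speech))
    centers = speech[rng.choice(len(speech), size=n_pick, replace=False)]
    phi_speech = gaussian_kernel(speech, centers, sigma)
    phi_noise = gaussian_kernel(noise, centers, sigma)
    H = phi_noise.T @ phi_noise / len(noise)  # second moment under noise
    h = phi_speech.mean(axis=0)               # first moment under speech
    # Regularized least-squares solution, then clip so the ratio stays >= 0.
    alpha = np.linalg.solve(H + lam * np.eye(n_pick), h)
    return centers, np.maximum(alpha, 0.0)

def density_ratio(x, centers, alpha, sigma=1.0):
    # Evaluate the estimated ratio for each row of x.
    return gaussian_kernel(x, centers, sigma) @ alpha

# Synthetic stand-ins for feature frames (not real speech data).
rng = np.random.default_rng(1)
noise_frames = rng.normal(0.0, 1.0, size=(500, 2))
speech_frames = rng.normal(3.0, 1.0, size=(500, 2))

centers, alpha = fit_ulsif(speech_frames, noise_frames)
ratio_speech_like = density_ratio(np.array([[3.0, 3.0]]), centers, alpha)[0]
ratio_noise_like = density_ratio(np.array([[0.0, 0.0]]), centers, alpha)[0]

# Hypothetical decision rule: flag a frame as speech when the ratio exceeds 1.
is_speech = ratio_speech_like > 1.0
```

No speech or noise density is ever fitted on its own here; only the coefficients of the ratio function are solved for. Because the ratio is a weighted sum of kernels, its shape is limited by the kernel width and the choice of centers, which is one way to read the abstract's remark that density ratio estimation constrains the shape of the likelihood ratio function.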
Keywords :
speech processing; statistical analysis; CENSREC-1-C; LRT; VAD; density ratio estimation models likelihood ratio functions; feature selection; likelihood ratio test; noise adaptation; robust voice activity detection; speech detection; system combination; Accuracy; Adaptation models; Estimation; Kernel; Noise; Noise measurement; Speech
Conference_Title :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific
Conference_Location :
Kaohsiung
DOI :
10.1109/APSIPA.2013.6694118