DocumentCode
1707856
Title
Correlation coefficient-based voice activity detector algorithm
Author
Craciun, A. ; Gabrea, M.
Author_Institution
Electr. Eng. Dept., Ecole de Technol. Super., Montreal, Que., Canada
Volume
3
fYear
2004
Firstpage
1789
Abstract
A voice activity detector (VAD) is an algorithm that distinguishes the speech regions of an input signal from the background noise, and it is an important step in many speech processing applications. The time-varying nature and wide variety of both speech and background noise make this problem difficult, especially at low signal-to-noise ratio (SNR), which is the case in many practical applications. In this paper we propose a new VAD algorithm designed to improve word boundary detection under a variable background noise level in a real-time application. The input signal is windowed in the time domain, and the energy and spectrum of the current frame are then computed. The first few frames are assumed not to contain speech and are used to obtain an initial estimate of the noise parameters. These parameters are updated during silence periods using a first-order autoregressive filter. To obtain robust parameters that do not depend on the amplitude of the spectrum, the correlation coefficient between the instantaneous spectrum and an average of the background noise spectrum is calculated. The speech regions can then be detected with a statistical approach that uses a simple binary Markov model of the speech activity process. To evaluate the performance of the proposed method, clean speech from the TIMIT database was corrupted with different types of noise from the NOISEX database at different SNR levels.
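The processing chain described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameter values, so `init_frames`, `alpha`, and `threshold` are hypothetical, and a fixed threshold on the correlation coefficient stands in for the binary Markov model decision, which the abstract does not specify. The function names are also illustrative.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Hann-windowed DFT magnitude of one frame (naive O(n^2) DFT, enough for a sketch)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def correlation_coefficient(a, b):
    """Pearson correlation; unchanged when either spectrum is scaled by a positive gain."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # flat spectrum: correlation undefined, treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def detect_speech(frames, init_frames=5, alpha=0.9, threshold=0.8):
    """Return one boolean per frame (True = speech).

    Assumptions (not from the paper): init_frames, alpha and threshold are
    hypothetical values, and the threshold test replaces the Markov model.
    """
    spectra = [magnitude_spectrum(f) for f in frames]
    bins = len(spectra[0])
    # Initial noise estimate: average spectrum of the first frames, assumed speech-free.
    noise = [sum(s[k] for s in spectra[:init_frames]) / init_frames
             for k in range(bins)]
    decisions = []
    for i, spec in enumerate(spectra):
        if i < init_frames:
            decisions.append(False)
            continue
        # A frame that no longer resembles the noise spectrum is labelled speech.
        is_speech = correlation_coefficient(spec, noise) < threshold
        if not is_speech:
            # First-order autoregressive update, applied only during silence.
            noise = [alpha * n + (1 - alpha) * s for n, s in zip(noise, spec)]
        decisions.append(is_speech)
    return decisions
```

Because the correlation coefficient is invariant to a positive gain, a frame of background noise that is simply louder than the initial estimate still correlates strongly with it, which is the amplitude independence the abstract points to.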
Keywords
Markov processes; acoustic noise; autoregressive processes; feature extraction; speech processing; speech recognition; statistical analysis; time-domain analysis; NOISEX database; SNR; VAD algorithm; average background noise spectrum; binary Markov model; clean speech dataset; correlation coefficient-based voice activity detector algorithm; current frame energy; current frame spectrum; first order autoregressive filter; input signal background noise; instantaneous spectrum correlation coefficient; low signal to noise ratio applications; noise corrupted TIMIT database; noise parameters estimate; noise variety; real time application; silence periods; speech activity process; speech processing applications; speech regions; speech variety; statistical signal processing; time domain windowed input signal; variable background noise level; voice activity detector; word boundary detection problem; Algorithm design and analysis; Background noise; Databases; Detectors; Filters; Noise level; Signal processing; Signal to noise ratio; Speech enhancement; Speech processing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Electrical and Computer Engineering, 2004. Canadian Conference on
ISSN
0840-7789
Print_ISBN
0-7803-8253-6
Type
conf
DOI
10.1109/CCECE.2004.1349763
Filename
1349763