Title :
Single Frequency Filtering Approach for Discriminating Speech and Nonspeech
Author :
Aneeja, G. ; Yegnanarayana, B.
Author_Institution :
Int. Inst. of Inf. Technol., Hyderabad, India
Abstract :
In this paper, a signal processing approach is proposed for speech/nonspeech discrimination. The approach is based on single frequency filtering (SFF), where the amplitude envelope of the signal is obtained at each frequency with high temporal and spectral resolution. This high resolution helps to exploit the resulting high signal-to-noise ratio (SNR) regions in time and frequency. The variance of the spectral information across frequency is higher for speech and lower for many types of noise. The mean and variance of the noise-compensated weighted envelopes are computed across frequency at each time instant. Decision logic is applied to the feature derived from the mean and variance values for a variety of degradations, including NTIMIT, CTIMIT and distance speech, in addition to degradation due to standard noise types. In all cases, the proposed method gives significantly better performance than the standard Adaptive Multi-rate VAD2 (AMR2) method. The AMR2 method is chosen for comparison because it adapts itself to different degradations and is seen to give good performance over different SNR conditions. The proposed method does not use training data to derive the characteristics of speech or noise, nor does it make any assumption about a nonspeech beginning. The SFF method appears promising in other applications of speech processing, such as pitch extraction and speech enhancement.
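The envelope and statistics steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the filter, so the sketch assumes one plausible realization: the frequency of interest is shifted to half the sampling frequency and a single-pole filter with its pole at z = -r is applied. The function names, the pole radius r = 0.99, and the choice of frequencies are all hypothetical. The noise compensation, weighting, and decision logic mentioned in the abstract are not reproduced here.

```python
import numpy as np


def sff_envelopes(x, fs, freqs_hz, r=0.99):
    """Return an array (num_freqs, num_samples) of amplitude envelopes.

    Assumption (not stated in the abstract): each envelope comes from shifting
    the frequency of interest to fs/2 and applying a single-pole filter with
    its pole at z = -r, just inside the unit circle.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    env = np.empty((len(freqs_hz), len(x)))
    for k, f in enumerate(freqs_hz):
        # Modulate so that the component at f Hz lands at pi rad/sample (fs/2).
        shift = np.pi - 2.0 * np.pi * f / fs
        xk = x * np.exp(1j * shift * n)
        # Single-pole recursion y[n] = -r * y[n-1] + xk[n].
        y = np.empty(len(x), dtype=complex)
        prev = 0.0 + 0.0j
        for i in range(len(x)):
            prev = -r * prev + xk[i]
            y[i] = prev
        # The amplitude envelope is the magnitude of the complex filter output.
        env[k] = np.abs(y)
    return env


def mean_and_variance_across_frequency(env):
    """Mean and variance over the frequency axis at each time instant."""
    return env.mean(axis=0), env.var(axis=0)
```

For a pure 1 kHz tone sampled at 8 kHz, the envelope at 1 kHz is far larger than the envelopes at neighbouring frequencies, so the variance across frequency is large. A signal whose energy is spread evenly over frequency would give a smaller variance relative to its mean, which is the contrast the abstract relies on.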
Keywords :
decision theory; filtering theory; signal resolution; speech processing; AMR2 method; CTIMIT; NTIMIT; SFF; SNR regions; adaptive multirate VAD2 method; decision logic; distance speech; high signal-to-noise ratio; high temporal resolution; noise-compensated weighted envelopes; nonspeech discrimination; pitch extraction; signal amplitude envelope; signal processing approach; single frequency filtering approach; spectral information; spectral resolution; speech discrimination; speech enhancement; speech processing; Databases; Degradation; Signal to noise ratio; Speech; Speech processing; Time-frequency analysis; Single frequency filtering (SFF); spectral variance; speech/nonspeech discrimination; temporal variance; voice activity detection (VAD); weighted component envelope;
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2015.2404035