Title :
Unvoiced Speech Segregation From Nonspeech Interference via CASA and Spectral Subtraction
Author :
Hu, Ke ; Wang, DeLiang
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
Abstract :
Although considerable effort in computational auditory scene analysis (CASA) has gone into segregating voiced speech from monaural mixtures, unvoiced speech segregation has received little attention. Unvoiced speech is highly susceptible to interference because of its relatively weak energy and lack of harmonic structure, which makes its segregation extremely difficult. This paper proposes a new approach to segregating unvoiced speech from nonspeech interference. The proposed system first removes estimated voiced speech and the periodic part of the interference based on cross-channel correlation. The remaining interference is more stationary, and we estimate the noise energy in unvoiced intervals using segregated speech in neighboring voiced intervals. Unvoiced speech segregation then occurs in two stages: segmentation and grouping. In segmentation, we apply spectral subtraction to generate time-frequency segments in unvoiced intervals. Unvoiced speech segments are subsequently grouped based on frequency characteristics of unvoiced speech using simple thresholding as well as Bayesian classification. The proposed algorithm is computationally efficient, and systematic evaluation and comparison show that our approach considerably improves the performance of unvoiced speech segregation.
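The noise-estimation and spectral-subtraction steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`estimate_noise`, `spectral_subtract`, `candidate_units`), the per-channel averaging rule, and the zero floor and threshold are all assumptions made for illustration. Energies are given per frame and per frequency channel.

```python
from typing import List

Frames = List[List[float]]  # frames x channels, energy per time-frequency unit


def estimate_noise(mixture_voiced: Frames, speech_voiced: Frames) -> List[float]:
    """Per-channel noise energy, assumed here to be the mean of
    (mixture energy - segregated speech energy) over neighboring voiced
    frames, floored at zero. The averaging rule is an illustrative assumption."""
    n_frames = len(mixture_voiced)
    n_channels = len(mixture_voiced[0])
    noise = [0.0] * n_channels
    for mix, sp in zip(mixture_voiced, speech_voiced):
        for c in range(n_channels):
            noise[c] += max(mix[c] - sp[c], 0.0)
    return [total / n_frames for total in noise]


def spectral_subtract(mixture_unvoiced: Frames, noise: List[float]) -> Frames:
    """Subtract the noise estimate from each unit, flooring at zero."""
    return [[max(e - n, 0.0) for e, n in zip(frame, noise)]
            for frame in mixture_unvoiced]


def candidate_units(residual: Frames, threshold: float = 0.0) -> List[List[bool]]:
    """Mark units whose residual energy exceeds a (hypothetical) threshold.
    Contiguous marked units would then form time-frequency segments."""
    return [[e > threshold for e in frame] for frame in residual]


# Toy example: two voiced frames and two unvoiced frames, two channels.
noise = estimate_noise([[5.0, 2.0], [7.0, 4.0]], [[4.0, 0.0], [4.0, 2.0]])
residual = spectral_subtract([[6.0, 1.0], [2.0, 3.0]], noise)
mask = candidate_units(residual)
```

In this sketch, units that survive subtraction are only candidates; the grouping stage described in the abstract (thresholding and Bayesian classification) would decide which segments belong to unvoiced speech.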
Keywords :
Bayes methods; harmonics; interference (signal); speech processing; Bayesian classification; computational auditory scene analysis (CASA); cross-channel correlation; harmonic structure; monaural mixtures; nonspeech interference; spectral subtraction; unvoiced speech segregation; Feature extraction; Harmonic analysis; Interference; Noise; Power harmonic filters; Speech; Time frequency analysis
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2010.2093893