DocumentCode
2021065
Title
Monaural voiced speech segregation based on combined cues and energy distribution
Author
Zhao, Liheng ; Wang, Zengfu
Author_Institution
Dept. of Autom., Univ. of Sci. & Technol. of China, Hefei, China
fYear
2010
fDate
23-25 Nov. 2010
Firstpage
57
Lastpage
63
Abstract
Monaural speech segregation is important for speech signal processing and has been studied extensively on the basis of auditory scene analysis principles. However, current segregation algorithms cannot achieve satisfactory performance in the high-frequency range. In this paper, we propose a system for monaural voiced speech segregation in which two novel ideas are investigated. First, combined cues (cross-channel correlation, temporal continuity, and onset/offset) are employed to generate segments in the high-frequency range. Second, the energy distribution of the mixed signal is used to indicate the reliability of the cues in the high-frequency range, and an alternative segmentation strategy is applied according to these reliabilities. Systematic evaluation and comparison show that the proposed system improves SNR gain.
Keywords
speech processing; SNR gain; auditory scene analysis; cues distribution; energy distribution; monaural voiced speech segregation algorithm; speech signal processing; systematic evaluation; Correlation; Erbium; Signal to noise ratio; Speech; Speech processing; Wideband
fLanguage
English
Publisher
IEEE
Conference_Titel
Audio Language and Image Processing (ICALIP), 2010 International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-4244-5856-1
Type
conf
DOI
10.1109/ICALIP.2010.5685014
Filename
5685014