Title :
On the robust automatic segmentation of spontaneous speech
Author :
Petek, Bojan ; Andersen, Ove ; Dalsgaard, Paul
Abstract :
Results from applying an improved algorithm to the automatic segmentation of spontaneous telephone-quality speech are presented and compared with the results obtained when white noise is superimposed on the speech. Three segmentation algorithms, all based on variants of the Spectral Variation Function (SVF), are compared. Experimental results are obtained on the OGI multi-language telephone speech corpus (OGI_TS). We show that applying auditory forward and backward masking effects prior to the SVF computation increases the robustness of the algorithm to white noise. When the average signal-to-noise ratio (SNR) is decreased to 10 dB, the peak ratio (defined as the ratio of the number of peaks measured at the target SNR to the number measured at the original SNR) increases by 16%, 12%, and 11% for the MFC (Mel Frequency Cepstra), RASTA (Relative Spectral Processing), and FBDYN (Forward Backward Auditory Masking Dynamic Cepstra) SVF segmentation algorithms, respectively.
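The two quantities the abstract relies on, the average SNR of the noise-corrupted speech and the peak ratio, can be illustrated with a minimal sketch. It is not the authors' implementation: the SVF computation itself is omitted, and the function names, the Gaussian noise model, the strict-local-maximum peak rule, and the `threshold` parameter are all assumptions made for illustration.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Return signal plus Gaussian white noise at an average SNR of snr_db.

    Assumption: the noise is zero-mean Gaussian and is scaled against the
    mean power of the whole signal.
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def count_peaks(curve, threshold=0.0):
    """Count strict local maxima of curve that exceed threshold.

    Assumption: a peak is a sample larger than both of its neighbours;
    the abstract does not say how peaks were picked.
    """
    return sum(
        1
        for i in range(1, len(curve) - 1)
        if curve[i] > threshold and curve[i - 1] < curve[i] > curve[i + 1]
    )


def peak_ratio(curve_at_target_snr, curve_at_original_snr, threshold=0.0):
    """Peaks counted at the target SNR divided by peaks at the original SNR."""
    original_peaks = count_peaks(curve_at_original_snr, threshold)
    if original_peaks == 0:
        raise ValueError("no peaks in the original-SNR curve")
    return count_peaks(curve_at_target_snr, threshold) / original_peaks


# Toy curves standing in for SVF outputs (hypothetical values).
original_curve = [0, 1, 0, 2, 0, 1, 0]      # 3 peaks
noisy_curve = [0, 1, 0, 1, 0, 2, 0, 1, 0]   # 4 peaks
print(peak_ratio(noisy_curve, original_curve))
```

In this reading, a peak ratio above 1 means that noise introduced additional peaks in the segmentation curve, so a smaller increase would indicate a more noise-robust algorithm.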
Keywords :
spectral analysis; speech processing; white noise; FBDYN; Forward Backward Auditory Masking Dynamic Cepstra; MFC; Mel Frequency Cepstra; OGI multi language telephone speech corpus; RASTA; Relative Spectral Processing; SVF segmentation algorithms; Spectral Variation Function; auditory forward/backward masking effects; average signal to noise ratio; peak ratio; robust automatic segmentation; segmentation algorithms; spontaneous telephone quality speech; Acoustic noise; Attenuation; Cepstral analysis; Frequency; Noise robustness; Signal to noise ratio; Speech enhancement; Speech recognition
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607750