Regional Information Center for Science and Technology - Usable speech detection using a context dependent Gaussian mixture model classifier

DocumentCode :

3379406

Title :

Usable speech detection using a context dependent Gaussian mixture model classifier

Author :

Yantorno, Robert E. ; Smolenski, Brett Y ; Iyer, Ananth N. ; Shah, Jashmin K.

Author_Institution :

Dept. of ECE, Temple Univ., Philadelphia, PA, USA

Volume :

fYear :

2004

fDate :

23-26 May 2004

Abstract :

Speech that is corrupted by nonstationary interference but still contains segments usable for applications such as speaker identification or speech recognition is referred to as "usable" speech. A common example of nonstationary interference is co-channel speech, in which more than one person talks at the same time. In general, these speech processing applications do not work in co-channel environments; however, they can work on the extracted usable segments. Unfortunately, currently available usable speech measures detect only about 75% of the total available usable speech. The first source of this high error is that no single feature can accurately identify all the usable speech characteristics. This can be resolved by using a Gaussian mixture model (GMM) based classifier to combine several usable speech features. A second source of error is that current usable speech measures treat each frame of co-channel data independently of the decisions made on adjacent frames. This can be resolved by using a hidden Markov model (HMM) to incorporate context dependent information from adjacent frames. Using this approach, we were able to obtain 84% detection of usable speech with a 16% false alarm rate.
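
The two ideas in the abstract (combining several features with per-class GMMs, then using an HMM so that each frame's decision depends on its neighbours) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-dimensional feature vectors, the GMM parameters, the two-state topology (0 = unusable, 1 = usable), and the 0.9 self-transition probability are all hypothetical.

```python
import numpy as np

def gmm_log_likelihood(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (K,); means and variances: (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff**2 / variances, axis=2)    # (T, K)
    weighted = log_comp + np.log(weights)
    peak = weighted.max(axis=1, keepdims=True)                         # log-sum-exp
    return (peak + np.log(np.exp(weighted - peak).sum(axis=1, keepdims=True)))[:, 0]

def viterbi(log_emissions, log_transitions, log_prior):
    """Most likely state sequence; log_emissions has shape (T, S)."""
    T, S = log_emissions.shape
    score = log_prior + log_emissions[0]
    backptr = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_transitions   # cand[i, j]: state i -> state j
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emissions[t]
    path = np.empty(T, dtype=int)
    path[-1] = score.argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return path

# Hypothetical class models: two components each, unit variances.
weights = np.array([0.5, 0.5])
variances = np.ones((2, 2))
unusable_means = np.array([[0.0, 0.0], [-1.0, -1.0]])
usable_means = np.array([[2.0, 2.0], [3.0, 3.0]])

# Hypothetical feature frames: a usable run with one ambiguous frame (index 2),
# followed by an unusable run.
frames = np.array([[2.0, 2.0], [2.0, 2.0], [0.9, 0.9], [2.0, 2.0],
                   [2.0, 2.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])

ll_unusable = gmm_log_likelihood(frames, weights, unusable_means, variances)
ll_usable = gmm_log_likelihood(frames, weights, usable_means, variances)

# Frame-independent decision: pick the class with the higher likelihood.
independent = (ll_usable > ll_unusable).astype(int)

# Context-dependent decision: "sticky" two-state HMM decoded with Viterbi.
log_transitions = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
log_prior = np.log(np.array([0.5, 0.5]))
smoothed = viterbi(np.stack([ll_unusable, ll_usable], axis=1),
                   log_transitions, log_prior)

print(independent.tolist())  # frame 2 is labelled unusable in isolation
print(smoothed.tolist())     # frame 2 stays usable once context is considered
```

In this toy setting the ambiguous frame is flipped by the frame-independent rule but kept as usable by the HMM, because switching state twice costs more than one weak emission. The real system's features, model sizes, and transition structure would have to come from the paper itself.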

Keywords :

Gaussian processes; hidden Markov models; speech processing; speech recognition; Gaussian mixture model classifier; co-channel speech; context dependence; context dependent information; corrupted speech; error source; extracted usable segments; false alarm rate; hidden Markov model; independent frame treatment; nonstationary interference; speaker identification; speech processing applications; usable speech detection; usable speech features; usable speech identification; usable speech reduction; usable speech segments; Audio recording; Context modeling; Current measurement; Hidden Markov models; Interference; Predictive models; Signal restoration; Speech enhancement; Speech processing; Speech recognition

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Circuits and Systems, 2004. ISCAS '04. Proceedings of the 2004 International Symposium on

Print_ISBN :

0-7803-8251-X

Type :

conf

DOI :

10.1109/ISCAS.2004.1329884

Filename :

1329884

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3379406