Regional Information Center for Science and Technology - Scaled factorial hidden Markov models: A new technique for compensating gain differences in model-based single channel speech separation

DocumentCode :

2795173

Title :

Scaled factorial hidden Markov models: A new technique for compensating gain differences in model-based single channel speech separation

Author :

Radfar, M.H. ; Wong, W. ; Dansereau, R.M. ; Chan, W.-Y.

Author_Institution :

Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON, Canada

fYear :

2010

fDate :

14-19 March 2010

Firstpage :

1918

Lastpage :

1921

Abstract :

In model-based single channel speech separation, factorial hidden Markov models (FHMM) have been successfully applied to model the mixture signal Y(t) = X(t) + V(t) in terms of trained patterns of the speech signals X(t) and V(t). Nonetheless, when the test signals are scaled versions of the trained patterns (i.e., g_x X(t) and g_v V(t)), the performance of FHMM degrades significantly. In this paper, we introduce a modification to FHMM, called scaled FHMM, which compensates for gain differences. In this technique, the scale factors are first expressed in terms of the target-to-interference ratio (TIR). Then, an iterative quadratic optimization approach is coupled with FHMM to estimate the TIR which, together with the decoded HMM sequences, maximizes the likelihood of the mixture signal. Experimental results on 180 mixtures with TIRs from 0 to 15 dB show that the proposed technique significantly outperforms unscaled FHMM as well as scaled and unscaled vector quantization speech separation techniques.
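
As an illustration of the first step named in the abstract (expressing the scale factors in terms of the TIR), the sketch below shows one way two gains could be tied to a single TIR value. The abstract does not state the paper's exact parameterization, so two assumptions are made here for illustration only: the unscaled sources X(t) and V(t) have equal power, and the gains are normalized so that g_x^2 + g_v^2 = 1. The function names gains_from_tir and tir_from_gains are hypothetical.

```python
import math


def gains_from_tir(tir_db):
    """Map a TIR in dB to a gain pair (g_x, g_v).

    Assumptions (not taken from the abstract): the unscaled sources have
    equal power, and the gains satisfy g_x**2 + g_v**2 == 1.
    """
    ratio = 10.0 ** (tir_db / 10.0)          # linear power ratio g_x^2 / g_v^2
    g_x = math.sqrt(ratio / (1.0 + ratio))   # gain of the target X(t)
    g_v = math.sqrt(1.0 / (1.0 + ratio))     # gain of the interferer V(t)
    return g_x, g_v


def tir_from_gains(g_x, g_v):
    """Inverse mapping: recover the TIR in dB from a gain pair."""
    return 20.0 * math.log10(g_x / g_v)


if __name__ == "__main__":
    # Sweep the TIR range used in the reported experiments (0 to 15 dB).
    for tir_db in (0, 5, 10, 15):
        g_x, g_v = gains_from_tir(tir_db)
        print(f"TIR = {tir_db:2d} dB  g_x = {g_x:.4f}  g_v = {g_v:.4f}")
```

Under these assumptions a single scalar (the TIR) determines both gains, so the gain search reduces to a one-dimensional problem. This sketch covers only that mapping and does not implement the FHMM decoding or the quadratic optimization.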

Keywords :

hidden Markov models; iterative methods; optimisation; speech processing; FHMM; TIR; decoded HMM sequences; iteration quadratic optimization approach; mixture signal likelihood maximization; model-based single-channel speech separation; scaled factorial hidden Markov models; speech signals; target-to-interference ratio; trained patterns; vector quantization speech separation techniques; Degradation; Hidden Markov models; Interference; Iterative decoding; Source separation; Speech recognition; Systems engineering and theory; Testing; Vector quantization; Viterbi algorithm; factorial hidden Markov models; mixmax approximation; model-based single channel speech separation; quadratic optimization; source separation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location :

Dallas, TX

ISSN :

1520-6149

Print_ISBN :

978-1-4244-4295-9

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2010.5495323

Filename :

5495323

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2795173