Regional Information Center for Science and Technology - Supervised Single-Microphone Multi-Talker Speech Separation with Conditional Random Fields

DocumentCode :

3605852

Title :

Supervised Single-Microphone Multi-Talker Speech Separation with Conditional Random Fields

Author :

Yu Ting Yeung ; Tan Lee ; Cheung-Chi Leung

Author_Institution :

Stanley Ho Big Data Decision Analytics Res. Centre, Chinese Univ. of Hong Kong, Hong Kong, China

Volume :

Issue :

Year :

2015

Firstpage :

2334

Lastpage :

2342

Abstract :

We apply conditional random fields (CRFs) to single-microphone speech separation in a supervised learning scenario. We train the parameters on mixture data in which the sources compete at the same average signal power. Compared with factorial hidden Markov model (HMM) baselines, the CRF settings require less training mixture data to improve objective speech quality measures and speech recognition accuracy of the reconstructed sources when the mixing ratios of the training and testing mixture data are matched. The CRF settings also handle minor mixing ratio mismatch after the gain factors of the sources are adjusted with non-linear mappings inspired by the mixture-maximization model. When the mixing ratio mismatch increases further, such that the speech mixture is dominated by a single source, the factorial HMM eventually catches up with and outperforms the CRF settings due to improved model accuracy. We also develop a convex statistical inference simplification based on linear-chain CRFs. After integrating additional observations, the simplification achieves the same performance level as the original CRF settings.

Keywords :

hidden Markov models; microphones; speech recognition; CRF; HMM; conditional random fields; hidden Markov model; mixture data; objective speech quality; statistical inference simplification; supervised learning; supervised single microphone multitalker speech separation; Mel frequency cepstral coefficient; Parameter estimation; Speech processing; Training; Conditional random fields (CRFs); single-microphone speech separation; statistical model-based methods

Language :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE/ACM Transactions on

Publisher :

IEEE

ISSN :

2329-9290

Type :

jour

DOI :

10.1109/TASLP.2015.2479039

Filename :

7268898

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3605852