Regional Information Center for Science and Technology - Unsupervised Adaptation With Discriminative Mapping Transforms

DocumentCode :

1239167

Title :

Unsupervised Adaptation With Discriminative Mapping Transforms

Author :

Yu, Kai ; Gales, Mark ; Woodland, Philip C.

Author_Institution :

Eng. Dept., Cambridge Univ., Cambridge

Volume :

Issue :

fYear :

2009

fDate :

5/1/2009

Firstpage :

714

Lastpage :

723

Abstract :

The most commonly used approaches to speaker adaptation are based on linear transforms, as these can be robustly estimated using limited adaptation data. Although significant gains can be obtained using discriminative criteria for training acoustic models, maximum-likelihood (ML) estimated transforms are still used for unsupervised adaptation. This is because discriminatively trained transforms are highly sensitive to errors in the adaptation supervision hypothesis. This paper describes a new framework for estimating transforms that are discriminative in nature but less sensitive to errors in this hypothesis. A speaker-independent discriminative mapping transformation (DMT) is estimated during training. This transform is obtained after a speaker-specific ML-estimated transform has been applied for each training speaker. During recognition, an ML speaker-specific transform is found for each test-set speaker, and the speaker-independent DMT is then applied. This allows a transform which is discriminative in nature to be indirectly estimated, while only requiring an ML speaker-specific transform to be found during recognition. The DMT technique is evaluated on an English conversational telephone speech task. Experiments showed that using DMT in unsupervised adaptation led to significant gains over both standard ML and discriminatively trained transforms.
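
The order of operations at recognition time (speaker-specific ML transform first, shared DMT second) can be illustrated with a minimal sketch. It assumes, purely for illustration, that both transforms are affine maps applied to a Gaussian mean vector; the abstract only says they are linear transforms and does not specify what they act on. The names `apply_affine` and `adapt_mean` are hypothetical, and no estimation is shown, only the composition.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = Sequence[Sequence[float]]
Affine = Tuple[Matrix, Sequence[float]]  # (A, b), hypothetical representation


def apply_affine(A: Matrix, b: Sequence[float], x: Sequence[float]) -> Vector:
    """Return A @ x + b for small dense matrices stored as nested sequences."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)]


def adapt_mean(mean: Sequence[float], ml_transform: Affine, dmt: Affine) -> Vector:
    """Apply the speaker-specific ML transform, then the speaker-independent DMT."""
    A_s, b_s = ml_transform  # assumed: estimated per speaker with an ML criterion
    A_d, b_d = dmt           # assumed: estimated once during training, shared by all speakers
    return apply_affine(A_d, b_d, apply_affine(A_s, b_s, mean))


# Toy values chosen only to make the composition visible.
ml_speaker = ([[2.0, 0.0], [0.0, 1.0]], [0.0, 1.0])
shared_dmt = ([[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5])
adapted = adapt_mean([1.0, 2.0], ml_speaker, shared_dmt)
```

Under these assumptions, only `ml_speaker` changes from one test speaker to the next; `shared_dmt` stays fixed, which mirrors the abstract's point that recognition needs nothing beyond an ML speaker-specific estimate.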

Keywords :

maximum likelihood estimation; speaker recognition; transforms; English conversational telephone speech task; acoustic model training; adaptation supervision hypothesis; linear transforms; maximum-likelihood estimation transform; speaker adaptation; speaker-independent discriminative mapping transformation; unsupervised adaptation; Digital audio broadcasting; Loudspeakers; OFDM modulation; Robustness; Speech analysis; Speech recognition; Target recognition; Telephony; Testing; Criterion mapping function; discriminative mapping transform; discriminative training

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2008.2011535

Filename :

4814782

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1239167