Fast speaker adaptation using triple diagonal and shared block diagonal transform matrices

Author

Ding, Guo-Hong ; Xu, Bo ; Iso-Sipilä, Juha ; Cao, Yang

Author_Institution

High-Tech Innovation Center, Acad. Sinica, Beijing, China

Volume

1

fYear

2003

fDate

6-10 April 2003

Abstract

This paper proposes two fast and effective speaker adaptation algorithms, called SATD and SASBD. Both are implemented within the MLLR framework and constrain the form of the transform matrices. SATD uses triple diagonal matrices to describe the mismatch between speakers and the acoustic model in the log-spectral domain; these matrices can then be transformed into the cepstral domain to adjust the acoustic model. SASBD differs from traditional block-diagonal MLLR in that a single matrix is shared across the three transformations of the basic MFCC and dynamic features. Both algorithms also provide multiple choices for the biases. Extensive experiments show the advantages of SATD and SASBD over traditional MLLR.
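
The two constraints described in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation, which the abstract does not give. It assumes a square orthonormal DCT-II between the log-spectral and cepstral domains (real MFCC front ends usually truncate the cepstrum), and all function names here (dct_matrix, tridiagonal, to_cepstral, apply_shared_block) are hypothetical. The optional bias argument is likewise only one possible reading of "multiple choices for the biases".

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n); its inverse equals its transpose."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                     for j in range(n)])
    return rows


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def tridiagonal(lower, diag, upper):
    """Build an n x n tridiagonal matrix from its three diagonals (3n - 2 values)."""
    n = len(diag)
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        t[i][i] = diag[i]
        if i > 0:
            t[i][i - 1] = lower[i - 1]
        if i < n - 1:
            t[i][i + 1] = upper[i]
    return t


def to_cepstral(t_log):
    """Map a log-spectral transform T to the cepstral domain as C T C^T.

    Assumption: the cepstrum is c = C s with C square and orthonormal.
    """
    c = dct_matrix(len(t_log))
    return matmul(matmul(c, t_log), transpose(c))


def apply_shared_block(a, mean, bias=None):
    """Apply one shared n x n matrix to each n-sized block of a mean vector.

    With three blocks (static, delta, delta-delta features) this is the same
    as multiplying by the block-diagonal matrix diag(A, A, A).
    """
    n = len(a)
    if len(mean) % n != 0:
        raise ValueError("mean length must be a multiple of the block size")
    out = []
    for start in range(0, len(mean), n):
        block = mean[start:start + n]
        out.extend(sum(a[i][j] * block[j] for j in range(n)) for i in range(n))
    if bias is not None:  # hypothetical: one full-length bias vector
        out = [o + b for o, b in zip(out, bias)]
    return out
```

In this sketch a tridiagonal matrix of size n has only 3n - 2 free values, and sharing one matrix across the three feature streams estimates one block instead of three. Fewer parameters to estimate is the usual reason such constraints help when adaptation data is scarce.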

Keywords

cepstral analysis; speaker recognition; MLLR frame; SASBD; SATD; acoustic model; cepstral domain; dynamic features; log-spectral domain; mismatch; multiple choices; shared block diagonal transform matrices; speaker adaptation; transform matrices; triple diagonal matrices; Automation; Cepstral analysis; Laboratories; Loudspeakers; Maximum likelihood linear regression; Mel frequency cepstral coefficient; Parameter estimation; Research and development; Technological innovation; Tides

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-7663-3

Type

conf

DOI

10.1109/ICASSP.2003.1198777

Filename

1198777