Evaluation of linear regression for speaker adaptation in HMM-based articulatory movements estimation

Author

Hao Li ; Jianhua Tao ; Yang Wang

Author_Institution

Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China

fYear

2015

fDate

19-24 April 2015

Firstpage

4944

Lastpage

4948

Abstract

The acoustic-to-articulatory inversion problem is usually studied in a speaker-specific manner because both articulatory data and acoustic features contain speaker-specific components. This paper presents our work on speaker-adaptation training for this problem. We implement speaker adaptation in HMM-based acoustic-to-articulatory inversion mapping and evaluate different structures for combining the articulatory data and acoustic features. The HMM-based inversion mapping models are built with single-stream and multi-stream structures, using either independent or shared clustering. Speaker adaptation is implemented with either a stream-independent structure or a shared adaptation structure. The constrained maximum likelihood linear regression (CMLLR) method is used for the speaker-adaptive transformation. The experimental results show that sharing the speaker-adaptive transformation between the articulatory feature stream and the acoustic feature stream can improve the estimation accuracy in inversion mapping. The multi-stream system with shared clustering and shared adaptive transformation achieves the best result among all the tested structures.

Keywords

acoustic signal processing; hidden Markov models; maximum likelihood estimation; pattern clustering; regression analysis; speaker recognition; HMM-based acoustic-to-articulatory inversion mapping; HMM-based articulatory movements estimation; acoustic feature stream; acoustic-to-articulatory inversion problem; articulatory data; articulatory feature stream; constrained maximum likelihood linear regression method; independent clustering structure; linear regression evaluation; shared adaptation structure; shared clustering structure; speaker-adaptation training; speaker-adaptive transformation; speaker-specific components; stream-independent structure; Acoustics; Adaptation models; Correlation; Hidden Markov models; Integrated circuits; Speech; Training; acoustic-to-articulatory inversion; maximum likelihood linear regression; speaker adaptation;

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location

South Brisbane, QLD

Type

conf

DOI

10.1109/ICASSP.2015.7178911

Filename

7178911