DocumentCode
730772
Title
Evaluation of linear regression for speaker adaptation in HMM-based articulatory movements estimation
Author
Hao Li ; Jianhua Tao ; Yang Wang
Author_Institution
Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
fYear
2015
fDate
19-24 April 2015
Firstpage
4944
Lastpage
4948
Abstract
The acoustic-to-articulatory inversion problem is usually studied in a speaker-specific manner because both articulatory data and acoustic features contain speaker-specific components. This paper presents our work on speaker-adaptation training for this problem. We implement speaker adaptation in HMM-based acoustic-to-articulatory inversion mapping and evaluate different structural combinations of the articulatory data and acoustic features. The HMM-based inversion mapping models are built with single-stream and multi-stream structures, using either independent or shared clustering. Speaker adaptation is implemented with both a stream-independent structure and a shared adaptation structure. The constrained maximum likelihood linear regression method is used for the speaker-adaptive transformation. The experimental results show that sharing the speaker-adaptive transformation between the articulatory feature stream and the acoustic feature stream can improve estimation accuracy in inversion mapping. The multi-stream system with shared clustering and a shared adaptive transformation gives the best result among all the tested structures.
Keywords
acoustic signal processing; hidden Markov models; maximum likelihood estimation; pattern clustering; regression analysis; speaker recognition; HMM-based acoustic-to-articulatory inversion mapping; HMM-based articulatory movements estimation; acoustic feature stream; acoustic-to-articulatory inversion problem; articulatory data; articulatory feature stream; constrained maximum likelihood linear regression method; independent clustering structure; linear regression evaluation; shared adaptation structure; shared clustering structure; speaker-adaptation training; speaker-adaptive transformation; speaker-specific components; stream-independent structure; Acoustics; Adaptation models; Correlation; Hidden Markov models; Integrated circuits; Speech; Training; acoustic-to-articulatory inversion; maximum likelihood linear regression; speaker adaptation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178911
Filename
7178911