Acoustic modeling using transform-based phone-cluster adaptive training

Author

Manohar, Vimitha ; Srinivas, C. Bhargav ; Umesh, S.

Author_Institution

Dept. of Electr. Eng., Indian Inst. of Technol. Madras, Chennai, India

fYear

2013

fDate

8-12 Dec. 2013

Firstpage

49

Lastpage

54

Abstract

In this paper, we propose a new acoustic modeling technique called Phone-Cluster Adaptive Training. In this approach, the parameters of context-dependent states are obtained by linear interpolation of several monophone cluster models, which are themselves obtained by adapting a canonical Gaussian Mixture Model (GMM) with linear transformations. The approach is inspired by Cluster Adaptive Training (CAT) for speaker adaptation and by the Subspace Gaussian Mixture Model (SGMM). The parameters of the model are updated in an adaptive training framework. The interpolation vectors implicitly capture the phonetic context information. The proposed approach shows substantial improvement over the Continuous Density Hidden Markov Model (CDHMM) and performance similar to that of the SGMM, while using significantly fewer parameters than both the CDHMM and the SGMM.
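
The two-level construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact equations, so everything here is an assumption made for illustration: only Gaussian means are adapted, each monophone cluster has one affine transform of the form [A | b], and the function names (`affine`, `state_mean`), shapes, and toy values are hypothetical and not taken from the paper.

```python
# Illustrative sketch only: names, shapes, and values are assumptions,
# not the formulation used in the paper.

def affine(transform, mean):
    """Apply an affine transform [A | b] (D rows, D+1 columns) to a D-dim mean."""
    extended = list(mean) + [1.0]  # append 1 so the last column acts as a bias
    return [sum(w * x for w, x in zip(row, extended)) for row in transform]

def state_mean(canonical_mean, cluster_transforms, weights):
    """Interpolate the transformed cluster means with one state's weight vector."""
    # Step 1: one cluster mean per monophone cluster, adapted from the canonical mean.
    cluster_means = [affine(t, canonical_mean) for t in cluster_transforms]
    # Step 2: linear interpolation of the cluster means for this state.
    dim = len(canonical_mean)
    return [sum(w * m[d] for w, m in zip(weights, cluster_means))
            for d in range(dim)]

# Toy example: 2-dimensional features, 2 monophone clusters.
canonical_mean = [1.0, 2.0]
cluster_transforms = [
    [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]],  # cluster 0: shift only
    [[2.0, 0.0, 0.0], [0.0, 0.5, 0.0]],   # cluster 1: scale only
]
weights = [0.25, 0.75]  # hypothetical interpolation vector for one state
mean = state_mean(canonical_mean, cluster_transforms, weights)
```

In a sketch like this, each context-dependent state stores only a short weight vector, while the canonical model and the cluster transforms are shared across states, which is consistent with the abstract's claim of a reduced parameter count.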

Keywords

Gaussian processes; interpolation; mixture models; speaker recognition; Gaussian mixture model; acoustic modeling technique; context-dependent states; continuous density hidden Markov model; interpolation vectors; linear interpolation; linear transformation; monophone cluster models; speaker adaptation; transform-based phone-cluster adaptive training; Adaptation models; Context modeling; Hidden Markov models; Interpolation; Training; Transforms; Vectors; Acoustic Modeling; Phone-Cluster Adaptive Training; Subspace Gaussian Mixture Models;

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on

Conference_Location

Olomouc

Type

conf

DOI

10.1109/ASRU.2013.6707704

Filename

6707704