DocumentCode
32340
Title
Estimating Speaker Height and Subglottal Resonances Using MFCCs and GMMs
Author
Arsikere, Harish ; Lulich, Steven M. ; Alwan, Abeer
Author_Institution
Electr. Eng. Dept., Univ. of California, Los Angeles, Los Angeles, CA, USA
Volume
21
Issue
2
fYear
2014
fDate
Feb. 2014
Firstpage
159
Lastpage
162
Abstract
This letter investigates the use of MFCCs and GMMs for 1) improving the state of the art in speaker height estimation, and 2) rapid estimation of subglottal resonances (SGRs) without relying on formant and pitch tracking (unlike our previous algorithm in [1]). The proposed system comprises a set of height-dependent GMMs modeling static and dynamic MFCC features, where each GMM is associated with a height value. Furthermore, since SGRs and height are correlated, each GMM is also associated with a set of SGR values (known a priori). Given a speech sample, speaker height and SGRs are estimated as weighted combinations of the values corresponding to the N most-likely GMMs. We assess the importance of using dynamic MFCC features and the weighted decision rule, and demonstrate the efficacy of our approach via experiments on height estimation (using TIMIT) and SGR estimation (using the Tracheal Resonance database).
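The weighted decision rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how the weights are computed, so the sketch assumes weights proportional to the exponentiated log-likelihoods of the N best-scoring models. The function name `estimate_from_top_n` and its parameters are hypothetical, and the per-model log-likelihoods are assumed to have been computed elsewhere by scoring the MFCC features against each height-dependent GMM.

```python
import math


def estimate_from_top_n(log_likelihoods, class_values, n=3):
    """Combine the values tied to the n most-likely models.

    log_likelihoods: one score per height-dependent GMM (assumed given).
    class_values:    the height (or one SGR) associated with each GMM.
    n:               number of top-scoring models to combine.
    """
    # Indices of the n models with the highest log-likelihood.
    ranked = sorted(
        range(len(log_likelihoods)),
        key=lambda i: log_likelihoods[i],
        reverse=True,
    )[:n]

    # ASSUMPTION: weights are proportional to exp(log-likelihood).
    # Subtracting the best score first avoids numerical underflow.
    best = log_likelihoods[ranked[0]]
    weights = [math.exp(log_likelihoods[i] - best) for i in ranked]

    total = sum(weights)
    return sum(w * class_values[i] for w, i in zip(weights, ranked)) / total


# Illustrative values only, not taken from the letter.
heights_cm = [160.0, 170.0, 180.0]
scores = [-10.0, -10.0, -50.0]
print(estimate_from_top_n(scores, heights_cm, n=2))  # 165.0
```

With n=1 the rule reduces to picking the value of the single most-likely model. The same function could be called once per SGR, passing the SGR values associated with each GMM in place of the heights.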
Keywords
Gaussian processes; mixture models; speaker recognition; Gaussian mixture model; Mel-frequency cepstral coefficient; TIMIT; dynamic MFCC feature; height dependent GMM; speaker height estimation; subglottal resonance; tracheal resonance database; Correlation; Databases; Estimation; Mel frequency cepstral coefficient; Signal processing algorithms; Speech; Training; GMMs; MFCCs; rapid estimation; speaker height; subglottal resonances;
fLanguage
English
Journal_Title
Signal Processing Letters, IEEE
Publisher
IEEE
ISSN
1070-9908
Type
jour
DOI
10.1109/LSP.2013.2295397
Filename
6689290