• DocumentCode
    178045
  • Title
    Bayesian vocal tract model estimates of nasal stops for speaker verification
  • Author
    Enzinger, Ewald ; Kasess, Christian H.
  • Author_Institution
    Acoust. Res. Inst., Vienna, Austria
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    1685
  • Lastpage
    1689
  • Abstract
    In this paper we report on speaker verification experiments using branched vocal tract model estimates of alveolar nasal (/n/) stops. While the discriminatory potential of nasal acoustics has long been established, their acoustic properties have so far mostly been characterized using spectral features. Here, we used a Bayesian estimation technique to obtain reflection coefficients of a branched-tube model of the combined nasal and oral tract. Parameters were then modeled using probabilistic linear discriminant analysis to calculate likelihood ratios for speaker verification trials. Performance was assessed on normal and high vocal effort speech using high-quality and mobile-telephone-transmitted recordings taken from the German-language Pool2010 corpus. Results are compared with those of systems based on mel-frequency cepstral coefficients (MFCC). Vocal tract parameter based systems outperform MFCC based systems in matched conditions, but lack robustness under mismatch, while being readily interpretable with respect to a physical speech production model.
  • Keywords
    speaker recognition; Bayesian estimation technique; Bayesian vocal tract model estimates; German-language Pool2010 corpus; MFCC based systems; alveolar nasal; branched vocal tract model; branched-tube model; discriminatory potential; mel-frequency cepstral coefficients; mobile-telephone-transmitted recordings; nasal acoustics; nasal stops; physical speech production model; probabilistic linear discriminant analysis; reflection coefficients; speaker verification; speaker verification trials; spectral features; vocal tract parameter based systems; Bayes methods; Electron tubes; Estimation; Mel frequency cepstral coefficient; Speaker recognition; Speech; Bayesian estimation; Nasals; likelihood ratio; speaker verification; vocal tract modeling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6853885
  • Filename
    6853885