• DocumentCode
    2791163
  • Title
    An adaptive initialization method for speaker diarization based on prosodic features
  • Author
    Imseng, David ; Friedland, Gerald
  • Author_Institution
    Idiap Res. Inst., Martigny, Switzerland
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    4946
  • Lastpage
    4949
  • Abstract
    This article presents a novel, adaptive initialization scheme that can be applied to most state-of-the-art Speaker Diarization algorithms, i.e., algorithms that use agglomerative hierarchical clustering with the Bayesian Information Criterion (BIC) and Gaussian Mixture Models (GMMs) of frame-based cepstral features (MFCCs). The initialization method combines the recently proposed “adaptive seconds per Gaussian” (ASPG) method with a new method, based on prosodic features, for pre-clustering and for estimating the number of initial clusters. The presented method has two important advantages. First, it requires no manual tuning and is robust against variations in file length and speaker count. Second, it outperforms our previously used initialization methods on all benchmark files presented in the 2006, 2007, and 2009 NIST Rich Transcription (RT) evaluations, yielding a relative Diarization Error Rate (DER) improvement of up to 67%.
  • Keywords
    Bayes methods; Gaussian processes; cepstral analysis; speaker recognition; ASPG method; BIC; Bayesian information criterion; GMM; Gaussian mixture model; MFCC; adaptive initialization method; adaptive seconds per Gaussian method; agglomerative hierarchical clustering; cluster estimation method; frame-based cepstral feature; prosodic feature; speaker diarization; Audio recording; Bayesian methods; Cepstral analysis; Clustering algorithms; Delay; Error analysis; Microphones; NIST; Robustness; Speech; Gaussian Mixture Models; Prosodic features; Speaker Diarization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495102
  • Filename
    5495102