• DocumentCode
    3191570
  • Title

    Dynamic speaker clustering algorithm based on minimal GMM distance tracing

  • Author

    He, Jun ; He, Qian-Hua ; Wang, Zhi-Feng ; Li, Yan-Xiong ; Luo, Hai-Yu

  • Author_Institution
    Sch. of Electron. & Inf. Eng., South China Univ. of Technol., Guangzhou, China
  • fYear
    2011
  • fDate
    20-23 March 2011
  • Firstpage
    51
  • Lastpage
    56
  • Abstract
    In speaker clustering, most algorithms rely heavily on pre-given thresholds, whose optimal values are hard to obtain. This paper proposes a speaker clustering algorithm that traces the minimal Bhattacharyya distance between two Gaussian Mixture Models (GMMs) and requires no pre-given thresholds. During clustering, if utterance sets A and B have the minimal distance, set B is regarded as a suspicious set whose utterances may come from the speaker of set A. A two-stage verification is then applied. First, a comparative likelihood is used to verify whether the suspicious set B was generated by the speaker of set A. Second, a comparative likelihood for each utterance in set B is used to judge whether it was produced by the speaker of set A; if so, that utterance is moved from set B to set A, and the models of sets A and B are then updated. The two stages are repeated until no speech set changes. Experiments on the Chinese 863 speech database give an average cluster purity (ACP) of 68.97% and a classification error ratio (CER) of 39%. By comparison, K-means and the Iterative Self-Organizing Data Analysis (ISODATA) algorithm with optimal thresholds give CERs of 35% and 38%, respectively.
  • Keywords
    Gaussian processes; pattern clustering; speech processing; Gaussian mixture models; K-means; comparative likelihood; dynamic speaker clustering algorithm; minimal Bhattacharyya distance; minimal GMM distance tracing; pre-given thresholds; two stage-verification; Arrays; Clustering algorithms; Equations; Heuristic algorithms; Mathematical model; Speech; Training; Gaussian mixture model; dynamic likelihood; minimal distance tracing; speaker clustering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cyber Technology in Automation, Control, and Intelligent Systems (CYBER), 2011 IEEE International Conference on
  • Conference_Location
    Kunming
  • Print_ISBN
    978-1-61284-910-2
  • Type

    conf

  • DOI
    10.1109/CYBER.2011.6011763
  • Filename
    6011763
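
The abstract's "minimal distance tracing" step can be illustrated with a small sketch. This is not the authors' implementation. The record does not say how the Bhattacharyya distance between two GMMs is computed, so the sketch assumes each utterance set is summarized by a single diagonal-covariance Gaussian, for which the distance has a closed form. The names `bhattacharyya_diag`, `closest_pair`, and the toy `models` are hypothetical, and the two-stage likelihood verification is not shown.

```python
import math
from itertools import combinations


def bhattacharyya_diag(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    Assumption: one Gaussian per utterance set stands in for the GMMs
    used in the paper.
    """
    dist = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        v = 0.5 * (v1 + v2)  # averaged variance for this dimension
        dist += 0.125 * (m1 - m2) ** 2 / v  # mean-separation term
        dist += 0.5 * math.log(v / math.sqrt(v1 * v2))  # variance-mismatch term
    return dist


def closest_pair(models):
    """Return (distance, name_a, name_b) for the pair with minimal distance."""
    best = None
    for (na, (ma, va)), (nb, (mb, vb)) in combinations(models.items(), 2):
        d = bhattacharyya_diag(ma, va, mb, vb)
        if best is None or d < best[0]:
            best = (d, na, nb)
    return best


# Hypothetical toy models: (mean vector, variance vector) per utterance set.
models = {
    "A": ([0.0, 0.0], [1.0, 1.0]),
    "B": ([0.5, 0.0], [1.0, 1.0]),
    "C": ([5.0, 5.0], [2.0, 2.0]),
}
dist, a, b = closest_pair(models)
print(a, b, round(dist, 5))  # A B 0.03125
```

In this sketch the pair (A, B) would be the candidate handed to the verification stages, with B treated as the suspicious set.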