  • DocumentCode
    2695443
  • Title
    Accommodating sample size effect on similarity measures in speaker clustering
  • Author
    Haubold, Alexander; Kender, John R.
  • Author_Institution
    Dept. of Comput. Sci., Columbia Univ., New York, NY
  • fYear
    2008
  • fDate
    June 23 2008-June 26 2008
  • Firstpage
    1525
  • Lastpage
    1528
  • Abstract
    We investigate the symmetric Kullback-Leibler (KL2) distance in speaker clustering and its previously unreported effects for differently sized feature matrices. Speaker data is represented as Mel frequency cepstral coefficient (MFCC) vectors, and features are compared using the KL2 metric to form clusters of speech segments for each speaker. We make two observations with respect to clustering based on KL2: (1) the accuracy of clustering is strongly dependent on the absolute lengths of the speech segments and their extracted feature vectors; (2) the accuracy of the similarity measure strongly degrades with the length of the shorter of the two speech segments. These effects of length can be attributed to the measure of covariance used in KL2. We demonstrate an empirical correction of this sample-size effect that increases clustering accuracy. We draw parallels to two vector quantization (VQ)-based similarity measures, one of which exhibits an equivalent effect of sample size, while the other is less influenced by it.
  • Keywords
    pattern clustering; quantisation (signal); speaker recognition; Mel frequency cepstral coefficient; feature matrices; sample size effect; similarity measures; speaker clustering; speaker data; speech segments; symmetric Kullback-Leibler distance; vector quantization; Computer science; Degradation; Feature extraction; Image segmentation; Labeling; Length measurement; Size measurement; Speech; Symmetric matrices
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo, 2008 IEEE International Conference on
  • Conference_Location
    Hannover
  • Print_ISBN
    978-1-4244-2570-9
  • Electronic_ISBN
    978-1-4244-2571-6
  • Type
    conf
  • DOI
    10.1109/ICME.2008.4607737
  • Filename
    4607737