• DocumentCode
    2323915
  • Title
    An optimal Bhattacharyya centroid algorithm for Gaussian clustering with applications in automatic speech recognition
  • Author
    Rigazio, Luca ; Tsakam, Brice ; Junqua, Jean-Claude
  • Author_Institution
    Speech Technol. Lab., Panasonic Technols Inc., Santa Barbara, CA, USA
  • Volume
    3
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1599
  • Abstract
    The problem of clustering Gaussian distributions can be effectively solved by standard vector quantization algorithms where the metric is defined by the Bhattacharyya distance. This paper presents a novel algorithm for computing the optimal centroid for a cluster of Gaussian distributions according to the Bhattacharyya metric. We show that this centroid maximizes an upper bound on the probability of representing the population modeled by the distributions associated with the cluster. The proposed method is evaluated in clustering distributions of hidden Markov model speech recognizers to reduce the overall memory consumption and runtime complexity of the decoding. Experimental results show that, depending on the task, the number of distributions can be reduced by a factor of 2 to 6 with an increase in recognition accuracy. When compared to a maximum likelihood centroid, the Bhattacharyya centroid provides a 13% error rate reduction in a 2k word recognition task.
  • Keywords
    Gaussian distribution; computational complexity; decoding; hidden Markov models; optimisation; speech coding; speech recognition; vector quantisation; 2k word recognition task; Gaussian clustering; automatic speech recognition; clustering Gaussian distributions; clustering distributions; error rate reduction; hidden Markov model speech recognizers; memory consumption; optimal Bhattacharyya centroid algorithm; optimal centroid; probability; recognition accuracy; runtime complexity; standard vector quantization algorithms; upper bound; Clustering algorithms; Distributed computing; Maximum likelihood decoding; Runtime; Speech analysis; Vector quantization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-6293-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2000.861998
  • Filename
    861998
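
The Bhattacharyya distance named in the abstract has a well-known closed form for two Gaussians. The sketch below illustrates that distance and the nearest-centroid assignment step of a vector quantization loop. It is not the centroid algorithm of the paper, which this record does not describe. It assumes diagonal covariances (an assumption; the record does not state the covariance structure), and the function names `bhattacharyya_distance_diag` and `nearest_centroid` are hypothetical.

```python
import math


def bhattacharyya_distance_diag(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    Each Gaussian is given as a list of per-dimension means and a list of
    per-dimension variances (assumed strictly positive).
    """
    mahalanobis_term = 0.0
    log_det_term = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        v_avg = 0.5 * (v1 + v2)  # averaged variance in this dimension
        mahalanobis_term += (m1 - m2) ** 2 / v_avg
        # ln(v_avg) - 0.5 * (ln v1 + ln v2) is this dimension's share of
        # ln(det(S) / sqrt(det(S1) * det(S2))) with S = (S1 + S2) / 2.
        log_det_term += math.log(v_avg) - 0.5 * (math.log(v1) + math.log(v2))
    return 0.125 * mahalanobis_term + 0.5 * log_det_term


def nearest_centroid(gaussian, centroids):
    """Index of the centroid closest to `gaussian` under the distance above.

    `gaussian` is a (mean, var) pair; `centroids` is a list of such pairs.
    """
    mean, var = gaussian
    distances = [
        bhattacharyya_distance_diag(mean, var, c_mean, c_var)
        for c_mean, c_var in centroids
    ]
    return min(range(len(centroids)), key=distances.__getitem__)
```

With equal variances the log-determinant term vanishes and the distance reduces to one eighth of the squared Mahalanobis distance between the means. How each centroid is updated after assignment is the contribution of the paper and is deliberately left out of this sketch.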