• DocumentCode
    1914491
  • Title
    A text-independent speaker recognition method robust against utterance variations
  • Author
    Matsui, Tomoko; Furui, Sadaoki
  • Author_Institution
    NTT Human Interface Lab., Tokyo, Japan
  • fYear
    1991
  • fDate
    14-17 Apr 1991
  • Firstpage
    377
  • Abstract
    The authors describe a VQ (vector-quantization)-based text-independent speaker recognition method which is robust against utterance variations. Three techniques are introduced to cope with temporal and text-dependent spectral variations. First, either an ergodic hidden Markov model or a voiced/unvoiced decision is used to classify input speech into broad phonetic classes. Second, a new distance measure, the distortion-intersection measure (DIM), is introduced for calculating VQ distortion of input speech compared to speaker-independent codebooks. Third, a normalization method, talker variability normalization (TVN), is introduced. TVN normalizes parameter variation taking both inter- and intra-speaker variability into consideration. The system was tested using utterances of nine speakers recorded over three years. The combination of the three techniques achieves high speaker identification accuracies of 98.5% using only vocal tract information and 99.0% using both vocal tract and pitch information.
  • Keywords
    Markov processes; data compression; encoding; speech recognition; DIM; VQ distortion; broad phonetic classes; distance measure; distortion-intersection measure; ergodic hidden Markov model; input speech classification; inter-speaker variability; intra-speaker variability; pitch information; speaker identification accuracies; speech recognition; talker variability normalization; temporal variations; text-dependent spectral variations; text-independent speaker recognition; utterance variations; vector-quantization; vocal tract information; voiced/unvoiced decision; Distortion measurement; Hidden Markov models; Humans; Laboratories; Parameter estimation; Robustness; Speaker recognition; Speech recognition; System testing; Viterbi algorithm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1991. ICASSP-91., 1991 International Conference on
  • Conference_Location
    Toronto, Ont.
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-0003-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.1991.150355
  • Filename
    150355