• DocumentCode
    840488
  • Title

    On the use of instantaneous and transitional spectral information in speaker recognition

  • Author

    Soong, Frank K. ; Rosenberg, Aaron E.

  • Author_Institution
    AT&T Bell Labs., Murray Hill, NJ, USA
  • Volume
    36
  • Issue
    6
  • fYear
    1988
  • fDate
    6/1/1988 12:00:00 AM
  • Firstpage
    871
  • Lastpage
    879
  • Abstract
    The use of instantaneous and transitional spectral representations of spoken utterances for speaker recognition is investigated. Linear-predictive-coding (LPC)-derived cepstral coefficients are used to represent instantaneous spectral information, and best linear fits of each cepstral coefficient over a specified time window are used to represent transitional information. An evaluation has been carried out using a database of isolated digit utterances over dialed-up telephone lines by 10 talkers. Two vector quantization (VQ) codebooks, instantaneous and transitional, were constructed from each speaker's training utterances. The experimental results show that the instantaneous and transitional representations are relatively uncorrelated, thus providing complementary information for speaker recognition. A rectangular window of approximately 100 ms duration provides an effective estimate of the transitional spectral features for speaker recognition. Also, simple transmission channel variations are shown to affect both the instantaneous spectral representations and the corresponding recognition performance significantly, while the transitional representations and performance are relatively resistant
  • Keywords
    analogue-digital conversion; encoding; filtering and prediction theory; spectral analysis; speech recognition; LPC; best linear fits; cepstral coefficients; codebooks; dialed-up telephone lines; instantaneous spectral information; isolated digit utterances; linear predictive coding; simple transmission channel variations; speaker recognition; spoken utterances; transitional spectral information; vector quantization; Cepstral analysis; Data mining; Databases; Distortion measurement; Filter bank; Linear predictive coding; Speaker recognition; Spectral analysis; Speech analysis; Telephony;
  • fLanguage
    English
  • Journal_Title
    Acoustics, Speech and Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0096-3518
  • Type

    jour

  • DOI
    10.1109/29.1598
  • Filename
    1598
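
The abstract describes transitional features as best linear fits of each cepstral coefficient over a rectangular time window. A minimal sketch of that idea follows, assuming the fit is an ordinary least-squares slope over a symmetric window of 2*half_width+1 frames. The function name, the half_width parameter, and the clamping of indices at the utterance edges are illustrative assumptions, not details taken from the paper. With a hypothetical 20 ms frame shift, half_width=2 would span roughly 100 ms.

```python
def transitional_features(cepstra, half_width=2):
    """Least-squares slope of each cepstral coefficient over a rectangular window.

    cepstra: list of frames, each frame a list of cepstral coefficients.
    half_width: frames on each side of the centre frame (must be >= 1).
    Returns a list of the same shape holding the per-frame slopes.
    """
    n_frames = len(cepstra)
    offsets = range(-half_width, half_width + 1)
    # For a symmetric window the offsets sum to zero, so the least-squares
    # slope reduces to sum(k * c[t + k]) / sum(k * k).
    denom = sum(k * k for k in offsets)
    slopes = []
    for t in range(n_frames):
        row = []
        for j in range(len(cepstra[t])):
            num = 0.0
            for k in offsets:
                # Assumption: repeat the first/last frame beyond the edges.
                idx = min(max(t + k, 0), n_frames - 1)
                num += k * cepstra[idx][j]
            row.append(num / denom)
        slopes.append(row)
    return slopes


# Usage: a coefficient that rises linearly with the frame index has a
# constant slope away from the utterance edges.
ramp = [[3.0 * t + 1.0, -1.0 * t] for t in range(10)]
deltas = transitional_features(ramp, half_width=2)
```

Because the slope ignores any constant offset in a coefficient track, a fixed additive shift of the cepstrum leaves these features unchanged, which is consistent with the abstract's remark that the transitional representation is less affected by simple channel variations.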