• DocumentCode
    2790435
  • Title
    Acoustic analysis for speaker identification of whispered speech
  • Author
    Fan, Xing ; Hansen, John H L
  • Author_Institution
    Center for Robust Speech Syst. (CRSS), Univ. of Texas at Dallas, Richardson, TX, USA
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    5046
  • Lastpage
    5049
  • Abstract
    Whisper is an alternative speech production mode to neutral speech that talkers use intentionally in natural conversational scenarios to protect personal privacy and avoid being overheard. Because whispered and neutral speech differ in vocal excitation and vocal tract function, the performance of speaker ID systems trained with neutral speech degrades significantly on whispered speech. In this study, a neutral-trained closed-set speaker ID task based on MFCC-GMM is considered. It is observed that, for whisper speaker recognition, the degradation is concentrated in a certain number of speakers. Next, an acoustic analysis is conducted to determine the cause of the degradation for those speakers. Finally, a confidence space is proposed to measure the quality of whispered speech for the task of speaker ID. Experimental evaluations demonstrate the effectiveness of this method in identifying whispered utterances with poor speaker information for a neutral/whisper mismatched speaker ID system. The proposed method makes it possible to compensate for those poor utterances while avoiding any harm to the other utterances, which maintain the performance of the neutral speaker ID task.
  • Keywords
    acoustic signal processing; bioacoustics; pattern recognition; speech; speech processing; MFCC-GMM; acoustic analysis; mel-frequency cepstral coefficients; neutral speech trained closed set speaker ID task; neutral-whisper speaker ID mismatch; whisper speaker recognition; whispered speech quality measurement; whispered speech speaker identification; whispered speech vocal excitation; whispered speech vocal tract function; Loudspeakers; Speech analysis; MFCC; Whispered speech; speaker identification; speaker information;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495059
  • Filename
    5495059