• DocumentCode
    2179944
  • Title
    Robust speaker identification using a CASA front-end
  • Author
    Zhao, Xiaojia ; Shao, Yang ; Wang, DeLiang
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    5468
  • Lastpage
    5471
  • Abstract
    Speaker recognition remains a challenging task under noisy conditions. Inspired by auditory perception, computational auditory scene analysis (CASA) typically segregates speech by producing a binary time-frequency mask. We first show that a recently introduced speaker feature, the Gammatone Frequency Cepstral Coefficient (GFCC), performs substantially better than conventional speaker features under noisy conditions. To deal with noisy speech, we apply CASA separation and then either reconstruct or marginalize the corrupted components indicated by the CASA mask. Both methods are effective. We further combine them into a single system depending on the detected signal-to-noise ratio (SNR). This system achieves significant performance improvements over related systems under a wide range of SNR conditions.
  • Keywords
    hearing; speaker recognition; CASA front-end; Gammatone frequency cepstral coefficient; SNR; auditory perception; binary time-frequency mask; computational auditory scene analysis; robust speaker identification; signal to noise ratio; Cepstral analysis; Noise measurement; Robustness; Speech; CASA; GFCC; ideal binary mask
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947596
  • Filename
    5947596
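
The binary time-frequency mask named in the abstract and keywords can be illustrated with a minimal sketch of an ideal binary mask. This is not the paper's CASA system, which has to estimate the mask from the noisy mixture. The sketch assumes the premixed speech and noise energies are known for every time-frequency unit. The function name `ideal_binary_mask`, the 0 dB default for `lc_db`, and the toy energy values are illustrative assumptions, not taken from the paper.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label each time-frequency unit 1 (speech-dominant) or 0 (noise-dominant).

    speech_energy, noise_energy: 2-D lists (channels x frames) of non-negative
    energies, assumed known from the premixed signals (hypothetical inputs).
    lc_db: local SNR criterion in dB; 0 dB is an assumed default.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                # No speech energy: the unit cannot be speech-dominant.
                row.append(0)
            elif n <= 0.0:
                # Speech present and no noise: local SNR is unbounded.
                row.append(1)
            else:
                local_snr_db = 10.0 * math.log10(s / n)
                row.append(1 if local_snr_db > lc_db else 0)
        mask.append(row)
    return mask


# Toy energies: 2 frequency channels x 3 time frames (made-up values).
speech = [[4.0, 0.5, 2.0], [0.1, 3.0, 1.0]]
noise = [[1.0, 1.0, 0.5], [1.0, 1.0, 1.0]]
mask = ideal_binary_mask(speech, noise)
print(mask)
```

Units labeled 0 are the "corrupted components" in the abstract's sense: a downstream recognizer would either reconstruct them or marginalize over them rather than trust their feature values.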