• DocumentCode
    3430643
  • Title
    Vocal source features for bilingual speaker identification
  • Author
    Wang, Jianglin; Johnson, Matthew Thomas
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Marquette Univ., Milwaukee, WI, USA
  • fYear
    2013
  • fDate
    6-10 July 2013
  • Firstpage
    170
  • Lastpage
    173
  • Abstract
    This paper introduces two new features for speaker identification, Residual Phase Cepstrum Coefficients (RPCC) and Glottal Flow Cepstrum Coefficients (GLFCC), which capture speaker-specific characteristics from vocal excitation patterns. Results on a cross-lingual speaker identification task taken from the NIST 2004 SRE demonstrate that the RPCC and GLFCC features are significantly more accurate than traditional mel-frequency cepstral coefficients (MFCC). In particular, the two new features give better results with smaller amounts of training data, due to lower model complexity.
  • Keywords
    Gaussian processes; maximum likelihood estimation; natural language processing; speaker recognition; GLFCC; GMM-UBM; Gaussian mixture model-universal background model; NIST 2004 SRE; RPCC; bilingual speaker identification; cross-lingual speaker identification task; glottal flow cepstrum coefficients; maximum a posteriori adaptation; model complexity; residual phase cepstrum coefficients; speaker-specific characteristics; vocal excitation patterns; vocal source features; Accuracy; Adaptation models; Feature extraction; Filtering; Mel frequency cepstral coefficient; Speaker recognition; Speech; Glottal source excitation; IAIF and GMM; Speaker identification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Signal and Information Processing (ChinaSIP), 2013 IEEE China Summit & International Conference on
  • Conference_Location
    Beijing
  • Type
    conf
  • DOI
    10.1109/ChinaSIP.2013.6625321
  • Filename
    6625321