Title :
PNCC-ivector-SRC based speaker verification
Author :
Ambikairajah, E. ; Kua, J.M.K. ; Sethu, Vidhyasaharan ; Haizhou Li
Author_Institution :
Sch. of Electr. Eng. & Telecommun., Univ. of New South Wales, Sydney, NSW, Australia
Abstract :
Most conventional features used in speaker recognition are based on Mel Frequency Cepstral Coefficients (MFCC) or Perceptual Linear Prediction (PLP) coefficients. Recently, Power Normalised Cepstral Coefficients (PNCC), which are computed based on auditory processing, have been proposed as an alternative to MFCC for robust speech recognition. This paper investigates the speaker verification performance of PNCC features with a Sparse Representation Classifier (SRC) that uses a mixture of ℓ1 and ℓ2 norms. It also explores score-level fusion of MFCC and PNCC i-vector based speaker verification systems. Evaluations on the NIST 2010 SRE extended database show that the fusion of MFCC-SRC and PNCC-SRC gives the best performance, with a DCF of 0.4977. Cosine distance scoring (CDS) based systems were also investigated; the fusion of MFCC-CDS and PNCC-CDS improved the EER from a baseline of 3.99% to 3.55%.
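As a rough illustration of the CDS, score-level fusion, and EER terms used in the abstract, the following is a minimal Python sketch. It is not the authors' implementation: the function names, the z-normalisation step, and the equal fusion weight are assumptions made for illustration, since the abstract does not specify how scores were normalised or combined.

```python
import numpy as np

def cosine_score(w_target, w_test):
    """Cosine similarity between an enrolment i-vector and a test i-vector."""
    w_target = np.asarray(w_target, dtype=float)
    w_test = np.asarray(w_test, dtype=float)
    return float(np.dot(w_target, w_test)
                 / (np.linalg.norm(w_target) * np.linalg.norm(w_test)))

def znorm(scores):
    """Zero-mean, unit-variance scaling (assumed here; requires non-constant scores)."""
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean()) / scores.std()

def fuse_scores(scores_a, scores_b, weight=0.5):
    """Linear score-level fusion of two systems (hypothetical equal weighting)."""
    return weight * znorm(scores_a) + (1.0 - weight) * znorm(scores_b)

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: the point where false-accept and false-reject rates meet."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    # A trial is accepted when its score is at or above the threshold.
    frr = np.array([(target_scores < t).mean() for t in thresholds])
    far = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2.0)
```

In this sketch, the per-trial scores of an MFCC-based and a PNCC-based system would be passed to `fuse_scores`, and the fused target and non-target scores to `equal_error_rate`. In practice the fusion weight would be tuned on development data rather than fixed at 0.5.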
Keywords :
audio signal processing; cepstral analysis; speaker recognition; CDS based system; DCF; EER; MFCC-SRC; PNCC i-vector SRC based speaker verification; auditory processing; cosine distance scoring; mel frequency cepstral coefficient; perceptual linear prediction coefficient; power normalised cepstral coefficient; robust speech recognition; score level fusion; sparse representation classifier; speaker recognition; Dictionaries; Feature extraction; Mel frequency cepstral coefficient; Smoothing methods; Support vector machine classification; Training; Vectors;
Conference_Title :
Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
Conference_Location :
Hollywood, CA
Print_ISBN :
978-1-4673-4863-8