Title :
Fusion strategies for distributed speaker recognition using residual signal based G729 resynthesized speech
Author :
Yessad, Dalila ; Amrouche, Abderrahmane
Author_Institution :
Speech Commun. & Signal Process. Lab., USTHB, Algiers, Algeria
Abstract :
With the development of VoIP (Voice over IP) services, there is an emerging need for speech compression, particularly for digital speech communication and biometric speaker recognition (SR) systems. This paper presents results from an SR system based on a Gaussian Mixture Model with a Universal Background Model (GMM-UBM), trained and tested on clean and G729 resynthesized speech. To overcome the performance loss due to the G729 codec, the residual signal extracted from the clean and G729 resynthesized databases is used. To improve performance further, we investigated score fusion strategies based on Logistic Regression (LR). The first fusion combines GMM-UBM scores obtained with LFCC (Linear Frequency Cepstrum Coefficients) extracted from the speech signal and LFCC extracted from the LP (Linear Prediction) residual signal. The second uses the LFCC extracted from G729 resynthesized speech and from its LP residual signal. The best performance is obtained with LR fusion: the correct recognition rate is 95% in the first case (baseline system) and 83% in the second case (G729 resynthesized speech). The results, obtained on the TIMIT database, demonstrate the effectiveness of data fusion techniques for automatic speaker recognition.
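Illustrative note (not part of the original record): the abstract does not say how the LR fusion is trained, so the following is only a minimal sketch of logistic-regression score fusion under stated assumptions. Each trial is assumed to carry two scores, one from an LFCC system and one from an LP-residual LFCC system; the score values, the function names `train_lr_fusion` and `fuse`, the learning rate, and the number of epochs are all hypothetical.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_lr_fusion(scores, labels, lr=0.1, epochs=2000):
    """Fit weights w and bias b so that sigmoid(w . s + b) models P(target | s).

    Plain batch gradient descent on the logistic loss (an assumption; the
    paper does not specify its optimizer).
    """
    n_systems = len(scores[0])
    n = len(scores)
    w = [0.0] * n_systems
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_systems
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(sum(wi * si for wi, si in zip(w, s)) + b) - y
            for k in range(n_systems):
                grad_w[k] += err * s[k]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def fuse(w, b, s):
    # Fused score: weighted sum of the per-system scores plus a bias.
    return sum(wi * si for wi, si in zip(w, s)) + b

# Hypothetical development trials: (LFCC score, residual-LFCC score).
dev_scores = [(2.1, 0.4), (1.5, 1.2), (0.3, 1.8), (1.9, 1.1),
              (-1.2, -0.3), (-0.4, -1.5), (0.2, -1.9), (-1.8, -0.9)]
dev_labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = target speaker, 0 = impostor

w, b = train_lr_fusion(dev_scores, dev_labels)
fused = [fuse(w, b, s) for s in dev_scores]
```

In this sketch a trial is accepted when its fused score is positive; in practice the fusion weights would be trained on held-out development scores and the decision threshold chosen for the target operating point.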
Keywords :
Gaussian processes; cepstral analysis; feature extraction; prediction theory; regression analysis; sensor fusion; speaker recognition; speech codecs; speech synthesis; G729 codec; G729 resynthesized database; G729 resynthesized speech; GMM UBM based SR system; GMM UBM score; LFCC; LP residual signal; LR fusion; SR systems; TIMIT database; VoIP; automatic speaker recognition; biometric speaker recognition; data fusion techniques; digital speech communication; distributed speaker recognition; linear frequency cepstrum coefficients; linear prediction residual signal; logistic regression; residual signal extraction; score fusion strategies; speech compression; universal background Gaussian mixture model; voice over IP service; Databases; Feature extraction; Logistics; Speaker recognition; Speech; Speech coding; Speech recognition; G729; LP residual; Logistic Regression (LR) fusion; Score fusion; VoIP;
Conference_Title :
Information Fusion (FUSION), 2013 16th International Conference on
Conference_Location :
Istanbul
Print_ISBN :
978-605-86311-1-3