Title :
Combination of pitch and MFCC GMM supervectors for speaker verification
Author :
Huang, Wei ; Chao, Jianshu ; Zhang, Yaxin
Author_Institution :
Motorola China Res. Center, Shanghai
Abstract :
The large majority of speaker verification systems are based on frame-level acoustic features such as Mel-frequency cepstral coefficients (MFCCs), which characterize the vocal tract contribution. The most commonly used statistical classifier, the Gaussian mixture model with a universal background model (GMM-UBM), models the distribution of MFCCs quite well. Pitch is one of the most important features characterizing the speaker-dependent vocal fold vibration rate, and as source information it can complement the vocal tract information. Although the source information is supposed to follow a lognormal distribution, the discriminative support vector machine (SVM) is more suitable for pitch classification. In this paper, we first apply GMM-UBM and SVM to frame-level pitch vectors. We then apply the state-of-the-art GMM supervector concept to the pitch feature vectors, and experiments show a promising result. Combining the GMM supervector systems for the two feature types yields much better performance. All experimental results are obtained on the NIST 2001 speaker database.
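The abstract does not spell out how a GMM supervector is built, so the following is only a minimal sketch of one common construction: the UBM means are MAP-adapted to an utterance's frames and then stacked into a single vector, which could serve as SVM input. The function names, the diagonal-covariance UBM, mean-only adaptation, and the relevance factor of 16 are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def gmm_posteriors(X, weights, means, variances):
    """Per-frame posterior of each diagonal-covariance GMM component.

    X: (T, D) frames; weights: (K,); means, variances: (K, D).
    """
    diff = X[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_joint = np.log(weights) + log_gauss                       # (T, K)
    log_joint -= log_joint.max(axis=1, keepdims=True)             # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def map_supervector(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM, stacked into a supervector.

    `relevance` is an assumed relevance factor, not a value from the paper.
    """
    post = gmm_posteriors(X, weights, means, variances)           # (T, K)
    n = post.sum(axis=0)                                          # soft counts, (K,)
    first_order = post.T @ X                                      # (K, D)
    data_means = first_order / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]                        # adaptation weight
    adapted = alpha * data_means + (1.0 - alpha) * means
    return adapted.ravel()                                        # length K * D

# Toy usage: a 2-component, 1-dimensional UBM and 16 identical frames.
ubm_weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0], [10.0]])
ubm_vars = np.array([[1.0], [1.0]])
frames = np.ones((16, 1))
supervector = map_supervector(frames, ubm_weights, ubm_means, ubm_vars)
```

In this toy case the frames fall almost entirely on the first component, so its mean moves halfway toward the data while the second mean stays at its UBM value. Supervectors built separately from pitch and MFCC frames could then be scored by separate SVMs and the scores fused, though the abstract does not say how the combination is performed.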
Keywords :
audio databases; cepstral analysis; speaker recognition; support vector machines; Mel frequency cepstral coefficients; frame-level acoustic features; pitch; speaker database; speaker verification; supervectors; support vector machine; vocal fold vibration rate; vocal tract contribution; vocal tract information; Gaussian distribution; Loudspeakers; Mel frequency cepstral coefficient; NIST; Spatial databases; Speaker recognition; Speech; Statistical distributions; Support vector machine classification; Support vector machines;
Conference_Title :
Audio, Language and Image Processing, 2008. ICALIP 2008. International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-1723-0
Electronic_ISBN :
978-1-4244-1724-7
DOI :
10.1109/ICALIP.2008.4590129