Regional Information Center for Science and Technology - Combination of pitch and MFCC GMM supervectors for speaker verification

DocumentCode :

2425113

Title :

Combination of pitch and MFCC GMM supervectors for speaker verification

Author :

Huang, Wei ; Chao, Jianshu ; Zhang, Yaxin

Author_Institution :

Motorola China Res. Center, Shanghai

fYear :

2008

fDate :

7-9 July 2008

Firstpage :

1335

Lastpage :

1339

Abstract :

The large majority of speaker verification systems are based on frame-level acoustic features, such as Mel frequency cepstral coefficients (MFCCs), which characterize the vocal tract contribution. The most commonly used statistical classifier, GMM-UBM, models the distribution of MFCCs quite well. Pitch is one of the most important features characterizing the speaker-dependent vocal fold vibration rate, and as source information it can complement the vocal tract information. Although the source information is assumed to follow a lognormal distribution, the discriminative support vector machine (SVM) is more suitable for pitch classification. In this paper, we first apply GMM-UBM and SVM classifiers to frame-level pitch vectors. We then apply the state-of-the-art GMM supervector concept to the pitch feature vectors, and experiments show promising results. Combining the GMM supervector systems for the two feature types yields much better performance. All experimental results are obtained on the NIST 2001 speaker database.
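The abstract does not spell out how a GMM supervector is built, so the following is only a minimal sketch of one common construction: the means of a universal background model (UBM) are MAP-adapted to an utterance's frames and then stacked into a single vector that an SVM can consume. The function name `map_adapted_supervector`, the diagonal-covariance UBM, the means-only adaptation, and the relevance factor of 16 are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def map_adapted_supervector(frames, weights, means, variances, relevance=16.0):
    """Stack MAP-adapted GMM means into one supervector (means-only adaptation).

    frames:    (T, D) feature frames of one utterance (e.g. pitch or MFCC vectors)
    weights:   (C,)   UBM mixture weights
    means:     (C, D) UBM component means
    variances: (C, D) UBM diagonal covariances
    relevance: assumed relevance factor controlling how far means move
    """
    # Log-density of every frame under every diagonal Gaussian: shape (T, C).
    diff = frames[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))

    # Component posteriors per frame, normalized in a numerically stable way.
    log_post = np.log(weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order sufficient statistics.
    n = post.sum(axis=0)                              # (C,)
    data_mean = (post.T @ frames) / np.maximum(n, 1e-10)[:, None]

    # Interpolate between the data mean and the UBM mean for each component.
    alpha = (n / (n + relevance))[:, None]
    adapted = alpha * data_mean + (1.0 - alpha) * means

    # Concatenate the C adapted means into a single (C * D,) supervector.
    return adapted.reshape(-1)

if __name__ == "__main__":
    # Toy one-dimensional UBM with two components (hypothetical values).
    ubm_weights = np.array([0.5, 0.5])
    ubm_means = np.array([[0.0], [10.0]])
    ubm_vars = np.array([[1.0], [1.0]])
    utterance = np.full((4, 1), 1.0)
    print(map_adapted_supervector(utterance, ubm_weights, ubm_means, ubm_vars))
```

Under these assumptions, one supervector would be computed per utterance for each feature type (pitch and MFCC), and the two resulting SVM systems could then be combined, for example at the score level; the record does not state which combination scheme the authors used.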

Keywords :

audio databases; cepstral analysis; speaker recognition; support vector machines; Mel frequency cepstral coefficients; frame-level acoustic features; pitch; speaker database; speaker verification; supervectors; support vector machine; vocal fold vibration rate; vocal tract contribution; vocal tract information; Gaussian distribution; Loudspeakers; Mel frequency cepstral coefficient; NIST; Spatial databases; Speaker recognition; Speech; Statistical distributions; Support vector machine classification; Support vector machines;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Audio, Language and Image Processing, 2008. ICALIP 2008. International Conference on

Conference_Location :

Shanghai

Print_ISBN :

978-1-4244-1723-0

Electronic_ISBN :

978-1-4244-1724-7

Type :

conf

DOI :

10.1109/ICALIP.2008.4590129

Filename :

4590129

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2425113