Regional Information Center for Science and Technology - Speaker identification with whispered speech based on modified LFCC parameters and feature mapping

DocumentCode :

3530154

Title :

Speaker identification with whispered speech based on modified LFCC parameters and feature mapping

Author :

Fan, Xing ; Hansen, John H L

Author_Institution :

Erik Jonsson Sch. of Eng. & Comput. Sci., Univ. of Texas at Dallas, Richardson, TX

fYear :

2009

fDate :

19-24 April 2009

Firstpage :

4553

Lastpage :

4556

Abstract :

Much recent research in speaker recognition has been devoted to robustness against microphone and channel effects. However, changes in vocal effort, especially whispered speech, present significant challenges in maintaining system performance. Because whisper lacks any periodic excitation, the spectral structure of whispered and neutral speech differs. Therefore, the performance of speaker ID systems, which are trained mainly with high-energy voiced phonemes, degrades when they are tested with whisper. This study considers a front-end feature compensation method for whispered speech to improve speaker recognition using a neutral-trained system. First, an alternative feature vector with linear frequency cepstral coefficients (LFCC) is introduced, based on spectral analysis of both speech modes. Next, for the first time, a feature mapping is proposed for reducing whisper/neutral mismatch in speaker ID. Feature mapping is applied on a frame-by-frame basis between two speaker-independent GMMs (Gaussian Mixture Models) of whispered and neutral speech. Text-independent, closed-set speaker ID results show an absolute 20% improvement in accuracy when compared with a traditional MFCC feature-based system. This result confirms a viable approach to improving speaker ID performance between neutral and whispered speech conditions.
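
The abstract contrasts LFCC features with traditional MFCC features. As a rough illustration only, the following sketch computes generic linear frequency cepstral coefficients with NumPy: triangular filters spaced evenly in Hz instead of on the mel scale, followed by a log and a DCT-II. It is not the modified LFCC parameterization of the paper, whose details are not given in this record. The function names and all parameter values (frame length, hop, FFT size, filter count, number of coefficients) are assumptions chosen for the example.

```python
import numpy as np

def linear_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are spaced evenly in Hz (not mel)."""
    edges = np.linspace(0.0, sample_rate / 2.0, n_filters + 2)
    bin_freqs = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    bank = np.zeros((n_filters, bin_freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        # Triangle = min of the two slopes, clipped at zero outside [lo, hi].
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def lfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=24, n_ceps=13):
    """Generic LFCC sketch; every default value here is illustrative."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply a Hamming window.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    # Linear-scale filterbank energies, then log compression.
    energies = power @ linear_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # Unnormalized DCT-II over the filter axis gives the cepstral coefficients.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return log_energies @ dct_basis.T
```

Calling `lfcc(x)` on a one-dimensional array of samples returns one row of coefficients per frame. Replacing the evenly spaced `edges` with mel-spaced edges would turn the same pipeline into an MFCC-style front end, which is the only difference this sketch is meant to highlight.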

Keywords :

Gaussian processes; cepstral analysis; learning (artificial intelligence); speaker recognition; Gaussian mixture model; feature vector mapping; frame-by-frame basis; front-end feature compensation method; linear frequency cepstral coefficient; modified LFCC parameter; neutral trained system; speaker ID system; speaker identification; speaker independent GMM; spectral analysis; spectral structure; whispered speech recognition; Degradation; Frequency; Microphones; Periodic structures; Robustness; Speech; System performance; System testing; Vectors; cepstrum coefficients; feature mapping; linear scale; linear scale cepstrum coefficients; whisper

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on

Conference_Location :

Taipei

ISSN :

1520-6149

Print_ISBN :

978-1-4244-2353-8

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2009.4960643

Filename :

4960643

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3530154