DocumentCode :
1370101
Title :
Robust speaker recognition: a feature-based approach
Author :
Mammone, Richard J. ; Zhang, Xiaoyu ; Ramachandran, Ravi P.
Volume :
13
Issue :
5
fYear :
1996
Firstpage :
58
Abstract :
The future commercialization of speaker- and speech-recognition technology is impeded by the large degradation in system performance due to environmental differences between training and testing conditions. This is known as the "mismatched condition." Studies have shown [1] that most contemporary systems achieve good recognition performance if the conditions during training are similar to those during operation (matched conditions). Frequently, mismatched conditions are present in which the performance is dramatically degraded as compared to the ideal matched conditions. A common example of this mismatch is when training is done on clean speech and testing is performed on noise- or channel-corrupted speech. Robust speech techniques [2] attempt to maintain the performance of a speech processing system under such diverse conditions of operation. This article presents an overview of current speaker-recognition systems and the problems encountered in operation, and it focuses on the front-end feature extraction process of robust speech techniques as a method of improvement. Linear predictive (LP) analysis, the first step of feature extraction, is discussed, and various robust cepstral features derived from LP coefficients are described. Also described is the affine transform, which is a feature transformation approach that integrates mismatch to simultaneously combat both channel and noise distortion.
Keywords :
Acoustic noise; Cepstral analysis; Data mining; Degradation; Feature extraction; Noise robustness; Speaker recognition; Speech processing; Speech recognition; System testing;
fLanguage :
English
Journal_Title :
Signal Processing Magazine, IEEE
Publisher :
IEEE
ISSN :
1053-5888
Type :
jour
DOI :
10.1109/79.536825
Filename :
536825