Title : 
Experimental evaluation of features for robust speaker identification
         
        
            Author : 
Reynolds, Douglas A.
         
        
            Author_Institution : 
Lincoln Lab., MIT, Lexington, MA, USA
         
        
        
        
        
            fDate : 
10/1/1994
         
        
        
        
            Abstract : 
This correspondence presents an experimental evaluation of different features and channel compensation techniques for robust speaker identification. The goal is to keep all processing and classification steps constant and to vary only the features and compensations used, allowing a controlled comparison. A general maximum-likelihood classifier based on Gaussian mixture densities is used, and experiments are conducted on the King speech database, a conversational, telephone-speech database. The features examined are mel-frequency and linear-frequency filterbank cepstral coefficients, linear prediction cepstral coefficients, and perceptual linear prediction (PLP) cepstral coefficients. The channel compensation techniques examined are cepstral mean removal, RASTA processing, and a quadratic trend removal technique. It is shown for this database that performance differences among the basic features are small, and the major gains are due to the channel compensation techniques. The best “across-the-divide” recognition accuracy of 92% is obtained for both high-order LPC features and band-limited filterbank features.
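As an illustration only, the following is a minimal sketch of two ideas named in the abstract: cepstral mean removal and maximum-likelihood identification with Gaussian mixture densities. It is not the implementation evaluated in the correspondence. The function names, the diagonal-covariance assumption, and the toy two-dimensional models are all hypothetical and chosen for brevity. The sketch relies on the fact that a fixed linear channel appears as a constant additive offset in the cepstral domain, so subtracting the per-utterance mean removes it.

```python
import math


def cepstral_mean_removal(frames):
    """Subtract the per-coefficient utterance mean from every cepstral frame."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of the frames under a diagonal-covariance GMM
    (diagonal covariances are an assumption of this sketch)."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xd, md, vd in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vd) + (xd - md) ** 2 / vd)
            comp_logs.append(log_p)
        # Log-sum-exp over mixture components for numerical stability.
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total


def identify_speaker(frames, speaker_models):
    """Return the key of the speaker model with the highest log-likelihood
    on the mean-removed frames (maximum-likelihood decision)."""
    compensated = cepstral_mean_removal(frames)
    return max(
        speaker_models,
        key=lambda s: gmm_log_likelihood(compensated, *speaker_models[s]),
    )


# Hypothetical toy models: (weights, component means, component variances).
toy_models = {
    "a": ([0.5, 0.5], [[-1.0, 0.0], [1.0, 0.0]], [[0.25, 0.25], [0.25, 0.25]]),
    "b": ([0.5, 0.5], [[0.0, -1.0], [0.0, 1.0]], [[0.25, 0.25], [0.25, 0.25]]),
}

# Two frames that match speaker "a" but carry a constant channel offset of +5.
test_frames = [[4.0, 5.0], [6.0, 5.0]]
print(identify_speaker(test_frames, toy_models))
```

In this toy case the constant offset is removed before scoring, so the decision depends only on the frame-to-frame variation. Real systems would estimate the mixture parameters from training speech, which this sketch does not attempt.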
         
        
            Keywords : 
filtering and prediction theory; linear predictive coding; maximum likelihood estimation; speech coding; speech recognition; Gaussian mixture densities; King speech database; RASTA processing; band-limited filterbank features; cepstral mean removal; channel compensation; database; experimental evaluation; high-order LPC features; linear-frequency filterbank; maximum-likelihood classifier; mel-frequency coefficients; perceptual linear prediction; quadratic trend removal; recognition accuracy; robust speaker identification; speech classification; speech processing; telephone-speech database; Cepstral analysis; Filter bank; Mel frequency cepstral coefficient; Performance gain; Robustness; Spatial databases; Speaker recognition; Speech analysis; Speech recognition; Transaction databases;
         
        
        
            Journal_Title : 
Speech and Audio Processing, IEEE Transactions on