Title : 
Estimation of handset nonlinearity with application to speaker recognition
         
        
            Author : 
Quatieri, Thomas F. ; Reynolds, Douglas A. ; O'Leary, Gerald C.
         
        
            Author_Institution : 
Lincoln Lab., MIT, Lexington, MA, USA
         
        
        
        
        
            fDate : 
9/1/2000 12:00:00 AM
         
        
        
        
            Abstract : 
A method is described for estimating telephone handset nonlinearity by matching the spectral magnitude of the distorted signal to the output of a nonlinear channel model driven by an undistorted reference. This “magnitude only” representation allows the model to directly match unwanted speech formants that arise over nonlinear channels and that are a potential source of degradation in speaker and speech recognition algorithms. As such, the method is particularly suited to algorithms that use only spectral magnitude information. The distortion model consists of a memoryless nonlinearity sandwiched between two finite-length linear filters. Nonlinearities considered include arbitrary finite-order polynomials and parametric sigmoidal functionals derived from a carbon-button handset model. The model parameters are estimated iteratively by gradient descent, minimizing a mean-squared spectral magnitude distance. Initial work has demonstrated the importance of addressing handset nonlinearity, in addition to linear distortion, in speaker recognition over telephone channels. A nonlinear handset “mapping,” applied to training or testing data to reduce mismatch between different types of handset microphone outputs, improves speaker verification performance relative to linear compensation only. Finally, a method is proposed to merge the mapper strategy with a method of likelihood score normalization (hnorm) for further mismatch reduction and speaker verification performance improvement.
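The distortion model and distance described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`handset_model`, `stft_mag`, `spectral_magnitude_mse`), the Hann-windowed framing, the frame sizes, and the choice of a polynomial (rather than sigmoidal) nonlinearity are all assumptions made for illustration, and the gradient descent step over the parameters is not shown.

```python
import numpy as np

def handset_model(x, g, poly, h):
    """Hypothetical forward model: filter g, memoryless polynomial, filter h."""
    u = np.convolve(x, g)                 # first finite-length linear filter
    v = np.polyval(poly[::-1], u)         # poly[k] multiplies u**k (memoryless)
    return np.convolve(v, h)              # second finite-length linear filter

def stft_mag(y, n_fft=256, hop=128):
    """Short-time spectral magnitudes (assumed framing: Hann window)."""
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))

def spectral_magnitude_mse(y_model, y_target, n_fft=256, hop=128):
    """Mean-squared distance between spectral magnitudes; phase is ignored."""
    n = min(len(y_model), len(y_target))  # compare over the common length
    a = stft_mag(y_model[:n], n_fft, hop)
    b = stft_mag(y_target[:n], n_fft, hop)
    return float(np.mean((a - b) ** 2))

# Illustrative use on synthetic data (stand-in for an undistorted reference).
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
identity = handset_model(x, np.array([1.0]), np.array([0.0, 1.0]), np.array([1.0]))
distorted = handset_model(x, np.array([1.0]), np.array([0.0, 1.0, 0.0, 0.5]), np.array([1.0]))
loss_identity = spectral_magnitude_mse(identity, x)
loss_distorted = spectral_magnitude_mse(distorted, x)
```

Because only magnitudes enter the distance, a sign-inverted signal scores the same as the original, which is the sense in which the representation is "magnitude only". An estimation procedure would adjust `g`, `poly`, and `h` to reduce this distance against the measured handset output.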
         
        
            Keywords : 
digital filters; iterative methods; least mean squares methods; polynomials; spectral analysis; speech recognition; telephone sets; arbitrary finite-order polynomials; carbon-button handset model; degradation; distorted signal; distortion model; finite-length linear filters; gradient descent technique; handset nonlinearity; iterative estimation; likelihood score normalization; linear compensation; linear distortion; magnitude only representation; mapper strategy; mean-squared spectral magnitude distance; memoryless nonlinearity; mismatch reduction; nonlinear channel model; nonlinear channels; nonlinearities; parametric sigmoidal functionals; speaker recognition; speaker verification; speaker verification performance; spectral magnitude; spectral magnitude information; speech recognition algorithms; telephone handset nonlinearity; undistorted reference; unwanted speech formants; Degradation; Microphones; Nonlinear distortion; Nonlinear filters; Polynomials; Speaker recognition; Speech recognition; Telephone sets; Telephony; Testing;
         
        
        
            Journal_Title : 
Speech and Audio Processing, IEEE Transactions on