Title :
Maximizing information content in feature extraction
Author :
Padmanabhan, Mukund ; Dharanipragada, Satya
Author_Institution :
Renaissance Technol., East Setauket, NY, USA
fDate :
7/1/2005
Abstract :
In this paper, we consider the problem of quantifying the amount of information that a set of features contains for discriminating between various classes. We explore these ideas in the context of a speech recognition system, where an important classification sub-problem is to predict the phonetic class given an observed acoustic feature vector. The connection between information content and speech recognition system performance is first explored in the context of various feature extraction schemes used in speech recognition applications. Subsequently, the idea of optimizing the information content to improve recognition accuracy is generalized to a linear projection of the underlying features. We show that several prior methods to compute linear transformations (such as linear/heteroscedastic discriminant analysis) can be interpreted in this general framework of maximizing the information content. We then extend this reasoning and propose a new objective function to maximize a penalized mutual information (pMI) measure. This objective function is seen to be very well correlated with the word error rate of the final system. Finally, experimental results are provided that show that the proposed pMI projection consistently outperforms other methods in a variety of cases, leading to relative improvements in the word error rate of 5%-16% over earlier methods.
Keywords :
feature extraction; speech processing; speech recognition; heteroscedastic discriminant analysis; information content maximization; linear discriminant analysis; observed acoustic feature vector; phonetic class; speech recognition system; Auditory system; Cepstral analysis; Data mining; Error analysis; Humans; Mutual information; System performance; Vectors; Classifiers; optimal feature projections; optimum feature extraction; penalized mutual information
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
DOI :
10.1109/TSA.2005.848876