DocumentCode :
3492632
Title :
Discriminant Kernels derived from the optimum nonlinear discriminant analysis
Author :
Kurita, Takio
Author_Institution :
Hiroshima Univ., Hiroshima, Japan
fYear :
2011
fDate :
July 31 2011-Aug. 5 2011
Firstpage :
299
Lastpage :
306
Abstract :
Linear discriminant analysis (LDA) is one of the well-known methods for extracting the best features for multi-class discrimination. Recently, kernel discriminant analysis (KDA) has been successfully applied in many applications. KDA is a nonlinear extension of LDA that constructs a nonlinear discriminant mapping by using kernel functions. However, the kernel function is usually defined a priori, and it is not known what the optimum kernel function for nonlinear discriminant analysis is. Moreover, class information is usually not used in defining the kernel function. In this paper, the optimum kernel function in terms of the discriminant criterion is derived by investigating the optimum discriminant mapping constructed by the optimum nonlinear discriminant analysis (ONDA). Otsu derived ONDA by assuming the underlying probabilities, in the spirit of Bayesian decision theory, and showed that the optimum nonlinear discriminant mapping is obtained by variational calculus. The optimum nonlinear discriminant mapping can be expressed as a linear combination of the Bayesian a posteriori probabilities, and the coefficients of the linear combination are obtained by solving an eigenvalue problem for matrices defined in terms of these probabilities. This means that ONDA is closely related to Bayesian decision theory. Otsu also showed that LDA can be interpreted as a linear approximation of ONDA, obtained by linearly approximating the Bayesian a posteriori probabilities. The kernel function derived here is likewise expressed in terms of the Bayesian a posteriori probabilities, which means that class information is naturally introduced into the kernel function.
For real applications, a family of discriminant kernel functions can be defined by changing the estimation method of the Bayesian a posteriori probabilities.
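The abstract describes the derived discriminant kernel as a function of the Bayesian a posteriori probabilities, with different posterior estimators yielding different members of the kernel family. Below is a minimal illustrative sketch, not the paper's implementation: it assumes a two-class problem with equal priors, estimates the posteriors from fitted Gaussian class-conditional densities (one possible estimation method), and takes the kernel to be the inner product of the estimated posterior vectors, so the kernel values directly carry class information.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: two 1-D classes (not from the paper)
X0 = rng.normal(-2.0, 1.0, size=200)
X1 = rng.normal(+2.0, 1.0, size=200)

# Fit a Gaussian class-conditional density to each class
m0, s0 = X0.mean(), X0.std()
m1, s1 = X1.mean(), X1.std()

def gauss_pdf(x, m, s):
    """Univariate Gaussian density."""
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def posteriors(x):
    """Estimated Bayesian a posteriori probabilities P(class | x),
    assuming equal priors for the two classes."""
    p = np.array([gauss_pdf(x, m0, s0), gauss_pdf(x, m1, s1)])
    return p / p.sum()

def discriminant_kernel(x, y):
    """One member of the discriminant-kernel family: the inner
    product of the posterior vectors at x and y. Swapping in a
    different posterior estimator yields a different kernel."""
    return float(posteriors(x) @ posteriors(y))

# The resulting Gram matrix is symmetric and positive semidefinite,
# since it is an inner product of posterior feature vectors.
pts = [-3.0, -1.0, 0.0, 1.0, 3.0]
G = np.array([[discriminant_kernel(a, b) for b in pts] for a in pts])
```

Because each posterior vector sums to one, `discriminant_kernel(x, x)` lies in [1/C, 1] for C classes (here C = 2), reaching 1 only where one class dominates completely.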
Keywords :
Bayes methods; decision theory; eigenvalues and eigenfunctions; matrix algebra; variational techniques; Bayesian a posteriori probabilities; Bayesian decision theory; class information; discriminant criterion; discriminant kernels; eigenvalue problem; feature extraction; kernel discriminant analysis; linear approximation; matrices; multiclass discrimination; optimum kernel function; optimum nonlinear discriminant analysis; optimum nonlinear discriminant mapping; variational calculus; Bayesian methods; Covariance matrix; Decision theory; Eigenvalues and eigenfunctions; Kernel; Linear approximation; Vectors;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Neural Networks (IJCNN), The 2011 International Joint Conference on
Conference_Location :
San Jose, CA
ISSN :
2161-4393
Print_ISBN :
978-1-4244-9635-8
Type :
conf
DOI :
10.1109/IJCNN.2011.6033235
Filename :
6033235