DocumentCode
980668
Title
Feature Extraction and Uncorrelated Discriminant Analysis for High-Dimensional Data
Author
Yang, Wen-Hui ; Dai, Dao-Qing ; Yan, Hong
Author_Institution
Sun Yat-Sen Univ., Guangzhou
Volume
20
Issue
5
fYear
2008
fDate
5/1/2008
Firstpage
601
Lastpage
614
Abstract
High-dimensional data and the small-sample-size problem occur in many modern pattern classification applications, such as face recognition and gene expression data analysis. One important step in dealing with such data is dimensionality reduction. Principal component analysis (PCA) and between-group analysis (BGA) are two commonly used methods, and various extensions of both exist. Both approaches are motivated by their best approximation property. From a pattern recognition perspective, we show that PCA, which is based on the total scatter matrix, preserves linear separability, and that BGA, which is based on the between-class scatter matrix, retains the distances between class centroids. Moreover, we propose an automatic nonparameter uncorrelated discriminant analysis (UDA) algorithm based on the maximum margin criterion (MMC). The features extracted via UDA are statistically uncorrelated. UDA combines rank-preserving dimensionality reduction with constraint discriminant analysis and also serves as an effective solution to the small-sample-size problem. Experiments with face images and gene expression data sets are conducted to evaluate UDA in terms of classification accuracy and robustness.
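The scatter matrices named in the abstract can be illustrated with a small sketch. It is not the paper's UDA algorithm: it only builds the total, between-class, and within-class scatter matrices in their usual unnormalized textbook form and scores a single direction w with w^T (S_b - S_w) w, the commonly cited form of the maximum margin criterion, which is assumed here rather than taken from the paper. All function names and the toy data are hypothetical.

```python
# Illustrative sketch only: scatter matrices and an MMC-style score.
# Names and data are hypothetical; this is not the UDA algorithm itself.

def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def outer(u, v):
    """Outer product u v^T as a nested list."""
    return [[ui * vj for vj in v] for ui in u]

def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    """Multiply every entry of matrix a by scalar s."""
    return [[x * s for x in row] for row in a]

def zeros(d):
    return [[0.0] * d for _ in range(d)]

def scatter_matrices(samples, labels):
    """Return (S_t, S_b, S_w) in unnormalized form."""
    d = len(samples[0])
    m = mean(samples)
    s_t, s_b, s_w = zeros(d), zeros(d), zeros(d)
    # Total scatter: deviations of every sample from the global mean.
    for x in samples:
        diff = [xi - mi for xi, mi in zip(x, m)]
        s_t = add(s_t, outer(diff, diff))
    for c in sorted(set(labels)):
        members = [x for x, y in zip(samples, labels) if y == c]
        mc = mean(members)
        # Between-class scatter: class centroid vs. global mean, weighted by class size.
        dc = [a - b for a, b in zip(mc, m)]
        s_b = add(s_b, scale(outer(dc, dc), len(members)))
        # Within-class scatter: deviations of samples from their own centroid.
        for x in members:
            diff = [xi - ci for xi, ci in zip(x, mc)]
            s_w = add(s_w, outer(diff, diff))
    return s_t, s_b, s_w

def mmc_score(w, s_b, s_w):
    """Assumed MMC form for one direction: w^T (S_b - S_w) w."""
    d = len(w)
    return sum(w[i] * (s_b[i][j] - s_w[i][j]) * w[j]
               for i in range(d) for j in range(d))

# Toy two-class data in two dimensions.
samples = [[1.0, 2.0], [2.0, 1.0], [6.0, 5.0], [5.0, 6.0]]
labels = [0, 0, 1, 1]
S_t, S_b, S_w = scatter_matrices(samples, labels)
```

On this toy data the total scatter equals the sum of the between-class and within-class scatter, and the direction joining the two class centroids, `[1.0, 1.0]`, receives a higher `mmc_score` than the orthogonal direction `[1.0, -1.0]`.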
Keywords
data analysis; data reduction; feature extraction; matrix algebra; pattern classification; principal component analysis; sampling methods; automatic nonparameter uncorrelated discriminant analysis algorithm; between-class scatter matrix; between-group analysis; constraint discriminant analysis; high-dimensional data analysis; maximum margin criterion; pattern classification application; pattern recognition; rank-preserving dimensionality reduction; sample size problem; total scatter matrix; Feature extraction or construction
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2007.190720
Filename
4384485