• DocumentCode
    980668
  • Title

    Feature Extraction and Uncorrelated Discriminant Analysis for High-Dimensional Data

  • Author

    Yang, Wen-Hui ; Dai, Dao-Qing ; Yan, Hong

  • Author_Institution
    Sun Yat-Sen Univ., Guangzhou
  • Volume
    20
  • Issue
    5
  • fYear
    2008
  • fDate
    5/1/2008
  • Firstpage
    601
  • Lastpage
    614
  • Abstract
    High-dimensional data and the small-sample-size problem occur in many modern pattern classification applications, such as face recognition and gene expression data analysis. An important step in dealing with such data is dimensionality reduction. Principal component analysis (PCA) and between-group analysis (BGA) are two commonly used methods, and various extensions of both exist. Both approaches derive from their best approximation property. From a pattern recognition perspective, we show that PCA, which is based on the total scatter matrix, preserves linear separability, and that BGA, which is based on the between-class scatter matrix, retains the distance between class centroids. Moreover, we propose an automatic nonparameter uncorrelated discriminant analysis (UDA) algorithm based on the maximum margin criterion (MMC). The features extracted via UDA are statistically uncorrelated. UDA combines rank-preserving dimensionality reduction with constraint discriminant analysis and also serves as an effective solution to the small-sample-size problem. Experiments with face images and gene expression data sets are conducted to evaluate UDA in terms of classification accuracy and robustness.
  • Keywords
    data analysis; data reduction; feature extraction; matrix algebra; pattern classification; principal component analysis; sampling methods; automatic nonparameter uncorrelated discriminant analysis algorithm; between-class scatter matrix; between-group analysis; constraint discriminant analysis; high-dimensional data analysis; maximum margin criterion; pattern classification application; pattern recognition; rank-preserving dimensionality reduction; sample size problem; total scatter matrix; Feature extraction or construction
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2007.190720
  • Filename
    4384485