• DocumentCode
    3694423
  • Title
    An experimental investigation on PCA based on cosine similarity and correlation for text feature dimensionality reduction
  • Author
    Maysa I Abdulhussain; John Q Gan
  • Author_Institution
    School of Computer Science and Electronic Engineering, University of Essex, Colchester, Essex CO4 3SQ, UK
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Principal component analysis (PCA) is a commonly used method for feature extraction and dimensionality reduction. This paper proposes PCA based on similarity/correlation criteria instead of covariance to gain low-dimensional features with high performance in text classification. Experimental results have demonstrated the advantages and usefulness of the proposed method in text classification in high-dimensional feature space, in terms of the number of features required to achieve the best classification accuracy.
  • Keywords
    "Principal component analysis","Covariance matrices","Correlation","Accuracy","Electronic mail","Computer science","Support vector machines"
  • Publisher
    IEEE
  • Conference_Title
    2015 7th Computer Science and Electronic Engineering Conference (CEEC)
  • Type
    conf
  • DOI
    10.1109/CEEC.2015.7332689
  • Filename
    7332689
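
The abstract contrasts covariance-based PCA with PCA built on similarity/correlation criteria but does not spell out the construction. The following is only a minimal sketch under an assumption: PCA is run by eigendecomposing a feature-by-feature matrix, and the covariance matrix is swapped for a cosine-similarity or Pearson-correlation matrix. All function names and the toy term-count matrix `X` are hypothetical, and power iteration stands in for a full eigensolver, so only the first component is recovered.

```python
import math

def column_center(X):
    """Subtract each column's mean (rows = documents, columns = term features)."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]

def covariance_matrix(X):
    """Sample covariance between features: the matrix used by standard PCA."""
    cols = list(zip(*column_center(X)))
    n = len(X)
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def cosine_matrix(X):
    """Cosine similarity between feature columns (no centering)."""
    cols = list(zip(*X))
    norms = [math.sqrt(sum(v * v for v in c)) for c in cols]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj) if ni and nj else 0.0
             for cj, nj in zip(cols, norms)]
            for ci, ni in zip(cols, norms)]

def correlation_matrix(X):
    """Pearson correlation = cosine similarity of mean-centered columns."""
    return cosine_matrix(column_center(X))

def top_component(M, iters=200):
    """Leading eigenvector of a symmetric PSD matrix via power iteration."""
    d = len(M)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project(X, v):
    """Score each document on a single component."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

# Hypothetical toy data: 4 documents, 3 term-count features.
X = [[2, 0, 1],
     [4, 0, 2],
     [0, 3, 1],
     [0, 6, 1]]

# Same pipeline, three choices of criterion matrix.
scores_cov = project(column_center(X), top_component(covariance_matrix(X)))
scores_cos = project(X, top_component(cosine_matrix(X)))
scores_cor = project(column_center(X), top_component(correlation_matrix(X)))
```

Only the criterion matrix changes between the three variants. Whether the data are centered before projection in the cosine case is a design choice made here for illustration, not something the abstract states.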