DocumentCode
3694423
Title
An experimental investigation on PCA based on cosine similarity and correlation for text feature dimensionality reduction
Author
Maysa I Abdulhussain;John Q Gan
Author_Institution
School of Computer Science and Electronic Engineering, University of Essex, Colchester, Essex CO4 3SQ, UK
fYear
2015
Firstpage
1
Lastpage
4
Abstract
Principal component analysis (PCA) is a commonly used method for feature extraction and dimensionality reduction. This paper proposes PCA based on cosine similarity or correlation criteria instead of covariance, to obtain low-dimensional features that perform well in text classification. Experimental results demonstrate the advantages of the proposed method for text classification in high-dimensional feature spaces, measured by the number of features required to achieve the best classification accuracy.
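The idea of swapping the covariance matrix for a correlation or cosine similarity matrix can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`feature_matrix`, `pca_project`), the choice to compute similarity between feature columns, and the decision to leave the data uncentered for the cosine criterion are all assumptions made for illustration.

```python
import numpy as np


def feature_matrix(X, criterion="covariance"):
    """Return a d x d symmetric matrix relating the d feature columns of X.

    Hypothetical helper: the criterion names are illustrative, not taken
    from the paper.
    """
    X = np.asarray(X, dtype=float)
    if criterion == "covariance":
        return np.cov(X, rowvar=False)
    if criterion == "correlation":
        # Note: a constant feature column yields NaN entries here.
        return np.corrcoef(X, rowvar=False)
    if criterion == "cosine":
        # Cosine similarity between feature columns (assumption: similarity
        # is measured between features, not between documents).
        norms = np.linalg.norm(X, axis=0)
        norms[norms == 0] = 1.0  # avoid division by zero for empty features
        Xn = X / norms
        return Xn.T @ Xn
    raise ValueError(f"unknown criterion: {criterion}")


def pca_project(X, k, criterion="covariance"):
    """Project X onto the top-k eigenvectors of the chosen matrix."""
    X = np.asarray(X, dtype=float)
    M = feature_matrix(X, criterion)
    # eigh is appropriate because M is symmetric; eigenvalues come back
    # in ascending order, so reorder to take the k largest.
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1][:k]
    W = eigvecs[:, order]
    if criterion == "cosine":
        Z = X @ W  # assumption: no centering for the cosine criterion
    else:
        Z = (X - X.mean(axis=0)) @ W  # assumption: center before projecting
    return Z, eigvals[order]


if __name__ == "__main__":
    # Toy term-count matrix: 6 documents x 4 terms (made-up data).
    X = np.array([
        [2, 0, 1, 0],
        [3, 1, 0, 0],
        [0, 2, 0, 3],
        [0, 3, 1, 2],
        [1, 0, 2, 0],
        [0, 1, 0, 4],
    ])
    for criterion in ("covariance", "correlation", "cosine"):
        Z, vals = pca_project(X, k=2, criterion=criterion)
        print(criterion, Z.shape, vals)
```

The reduced features `Z` would then be passed to a classifier; comparing accuracy against `k` for each criterion is the kind of experiment the abstract describes.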
Keywords
"Principal component analysis","Covariance matrices","Correlation","Accuracy","Electronic mail","Computer science","Support vector machines"
Publisher
IEEE
Conference_Titel
Computer Science and Electronic Engineering Conference (CEEC), 2015 7th
Type
conf
DOI
10.1109/CEEC.2015.7332689
Filename
7332689