DocumentCode :
3537706
Title :
Extending Linear Discriminant Analysis by Using Unlabeled Data
Author :
Lee, Young Tae ; Shin, Yong Joon ; Park, Cheong Hee
Author_Institution :
Dept. of Comput. Sci. & Eng., Chungnam Nat. Univ., Daejeon, South Korea
fYear :
2011
fDate :
Aug. 31 2011-Sept. 2 2011
Firstpage :
557
Lastpage :
562
Abstract :
Linear discriminant analysis (LDA) is a traditional dimension reduction method that finds projective directions maximizing the separability between classes. However, when the number of labeled data points is small, the performance of LDA degrades severely. In this paper, we propose two improved LDA methods that utilize abundant unlabeled data. Instead of using all the unlabeled data, as most semi-supervised dimension reduction methods do, we select confident unlabeled data and develop extended LDA algorithms. In the first method, a graph-based LDA method is developed to utilize confidence scores for the chosen unlabeled data, so that unlabeled data with a low confidence score contribute less than unlabeled data with a high confidence score. In the second method, the selected unlabeled data points are used to modify the class centroids in the objective function of LDA. Extensive experimental results in text classification demonstrate the effectiveness of the proposed methods compared with other semi-supervised dimension reduction methods.
Keywords :
data reduction; graph theory; statistical analysis; text analysis; confidence scores; graph based LDA method; labeled data points; linear discriminant analysis; semisupervised dimension reduction methods; text classification; unlabeled data; Accuracy; Computers; Eigenvalues and eigenfunctions; Principal component analysis; Training; Training data; Vectors; Linear discriminant analysis; Semi-supervised dimension reduction; Text classification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Information Technology (CIT), 2011 IEEE 11th International Conference on
Conference_Location :
Pafos
Print_ISBN :
978-1-4577-0383-6
Electronic_ISBN :
978-0-7695-4388-8
Type :
conf
DOI :
10.1109/CIT.2011.52
Filename :
6036825