Title :
Linear and Non-Linear Dimensional Reduction via Class Representatives for Text Classification
Author :
Zeimpekis, Dimitrios ; Gallopoulos, Efstratios
Author_Institution :
Comput. Eng. & Inf. Dept., Univ. of Patras, Patras
Abstract :
We address the problem of building fast and effective text classification tools. We describe a "representatives methodology" related to feature extraction and illustrate its performance using two vehicles: a centroid-based method and a method based on clustered LSI, both recently proposed as tools for low-rank matrix approximation and as cost-effective alternatives to LSI. The methodology is flexible and provides a means of accelerating existing algorithms. It is also combined with kernel techniques to enable the analysis of data for which linear techniques are insufficient. Numerous classification examples indicate that the proposed technique is effective and efficient, with overall performance superior to that of existing linear and nonlinear LSI-based approaches.
Keywords :
approximation theory; classification; feature extraction; text analysis; class representatives; matrix approximation; nonlinear dimensional reduction; representatives methodology; text classification; Automotive engineering; Clustering algorithms; Costs; Informatics; Kernel; Large scale integration; Testing; Text categorization; Vehicles
Conference_Title :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2701-7
DOI :
10.1109/ICDM.2006.98