DocumentCode :
3641558
Title :
Effects of dimensionality reduction and feature selection in text classification
Author :
Osman Durmaz;Hasan Şakir Bilge
Author_Institution :
Bilgisayar Mü
fYear :
2011
fDate :
4/1/2011
Firstpage :
21
Lastpage :
24
Abstract :
The goal of classifying text, or data in general, is to reduce the time needed to access information. The continuously increasing number of documents makes it impossible to classify them manually, which is where automatic text classification systems come in. In these systems, the large size of the data space is an important problem. By using dimensionality reduction techniques and feature selection in text classification systems, it is possible to classify correctly with a reduced amount of data. In this study, the Discrete Cosine Transform (DCT) method and feature selection with the Proportion of Variance method are proposed in order to obtain more effective classification results and a shorter classification time. The experiments use the WebKB dataset and the R8 subset of Reuters-21578. With the DCT method, classification success is largely preserved, and with the Proportion of Variance method, classification success increases.
Keywords :
"Signal processing","Conferences","Helium"
Publisher :
IEEE
Conference_Titel :
Signal Processing and Communications Applications (SIU), 2011 IEEE 19th Conference on
ISSN :
2165-0608
Print_ISBN :
978-1-4577-0462-8
Type :
conf
DOI :
10.1109/SIU.2011.5929577
Filename :
5929577
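Illustrative sketch :
The abstract names two steps, DCT-based dimensionality reduction and feature selection by proportion of variance, without defining them. The following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that the DCT is applied to each document's feature vector and the leading coefficients are kept, and that "Proportion of Variance" means keeping the highest-variance features until they account for a chosen share of the total variance. The function names `dct2`, `reduce_with_dct`, and `select_by_variance_proportion` are hypothetical.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        for k in range(n)
    ]


def reduce_with_dct(vector, keep):
    """Assumed reduction step: keep only the first `keep` DCT coefficients."""
    return dct2(vector)[:keep]


def select_by_variance_proportion(rows, proportion):
    """Assumed selection step: return the indices of the highest-variance
    columns whose variances together reach `proportion` of the total."""
    n_rows, n_cols = len(rows), len(rows[0])
    variances = []
    for j in range(n_cols):
        column = [row[j] for row in rows]
        mean = sum(column) / n_rows
        variances.append(sum((v - mean) ** 2 for v in column) / n_rows)
    total = sum(variances)
    # Visit columns from highest to lowest variance.
    order = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    chosen, running = [], 0.0
    for j in order:
        if total == 0 or running >= proportion * total:
            break
        chosen.append(j)
        running += variances[j]
    return sorted(chosen)
```

Under these assumptions, each document vector would first be shortened with `reduce_with_dct`, or the feature columns would be filtered with `select_by_variance_proportion`, before being passed to a classifier. The paper itself may define either step differently.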