DocumentCode
714361
Title
Two new feature extraction methods for text classification: TESDF and SADF
Author
Kilic, Erdal ; Ates, Nurullah ; Karakaya, Aykut ; Sahin, Durmus Ozkan
Author_Institution
Bilgisayar Muhendisligi Bolumu, Ondokuz Mayis Univ., Samsun, Turkey
fYear
2015
fDate
16-19 May 2015
Firstpage
475
Lastpage
478
Abstract
In this study, two new document weighting methods are proposed, based on term frequency-inverse document frequency (TF-IDF), which is widely used in text mining. In addition, the insignificance of verbs in text classification is put forward and tested as a new pre-processing step. Better results were observed with the proposed methods than with the method they were compared against. It was also observed that omitting verbs from the texts hardly changed the performance rate while reducing the size of the processed data.
Keywords
document image processing; feature extraction; text analysis; SADF; TESDF; document weighting methods; feature extraction methods; term frequency-inverse document frequency; text classification; text mining methods; Automation; Conferences; Niobium; Signal processing; Signal processing algorithms; Text categorization; inverse document frequency; term weighting
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing and Communications Applications Conference (SIU), 2015 23th
Conference_Location
Malatya
Type
conf
DOI
10.1109/SIU.2015.7129862
Filename
7129862