• DocumentCode
    714361
  • Title
    Two new feature extraction methods for text classification: TESDF and SADF

  • Author

    Kilic, Erdal ; Ates, Nurullah ; Karakaya, Aykut ; Sahin, Durmus Ozkan

  • Author_Institution
    Bilgisayar Muhendisligi Bolumu, Ondokuz Mayis Univ., Samsun, Turkey
  • fYear
    2015
  • fDate
    16-19 May 2015
  • Firstpage
    475
  • Lastpage
    478
  • Abstract
    In this study, two new document weighting methods are proposed, based on the term frequency-inverse document frequency (TF-IDF) scheme commonly used in text mining. In addition, the insignificance of verbs in text classification is put forward and tested as a new pre-processing step. The proposed methods gave better results when compared with other methods. Omitting the verbs from the texts left the performance rate almost unchanged while reducing the size of the processed data.
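    The abstract does not define TESDF or SADF, so they are not reproduced here. As background only, the following is a minimal sketch of the standard TF-IDF weighting the proposed methods are said to build on, assuming relative term frequency and an unsmoothed natural-log IDF. The function name `tf_idf` and the sample documents are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed (textbook) weighting, not the paper's TESDF or SADF:
        weight = (count / doc_length) * ln(n_docs / doc_frequency)
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
docs = [
    ["stock", "market", "falls"],
    ["team", "wins", "match"],
    ["market", "team", "report"],
]
weights = tf_idf(docs)
```

    Under this weighting, a term that occurs in every document gets weight 0, and terms confined to fewer documents get larger weights.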
  • Keywords
    document image processing; feature extraction; text analysis; SADF; TESDF; document weighting methods; feature extraction methods; term frequency-inverse document frequency; text classification; text mining methods; Automation; Conferences; Feature extraction; Niobium; Signal processing; Signal processing algorithms; Text categorization; inverse document frequency; term weighting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing and Communications Applications Conference (SIU), 2015 23th
  • Conference_Location
    Malatya
  • Type
    conf
  • DOI
    10.1109/SIU.2015.7129862
  • Filename
    7129862