• DocumentCode
    2935845
  • Title
    Genre and author detection in Turkish texts using artificial immune recognition systems
  • Author
    Kaban, Zafer ; Diri, Banu
  • Author_Institution
    Bilgisayar Muhendisligi Bolumu, Yildiz Teknik Univ., Istanbul
  • fYear
    2008
  • fDate
    20-22 April 2008
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This study investigates the performance of artificial immune recognition systems in genre and author detection, using the method of (H. Yildiz et al., 2007), which represents a document in a different scheme. Most current studies rely on the bag-of-words model, which takes the roots or stems of words as features. This increases both the number of features and the classification time. In this study we use the method we call YTU, which applies a weighting algorithm to word stems and reduces the number of features to the number of classes, resulting in lower classification time and better performance. The artificial immune recognition algorithms AIRS1, AIRS2 and AIRS2Parallel, tested with this method, showed improved performance in genre and author detection. The experimental results section compares the classification performance of the classifiers most commonly used for author and genre detection, namely naive Bayes (NB), support vector machine (SVM), random forest (RF) and k-nearest neighbour (k-NN), with that of the artificial immune system algorithms. In genre detection in particular, the AIRS2Parallel classifier, along with random forest and k-nearest neighbour, gives the highest performance of 99.6%. This shows that artificial immune recognition algorithms can be used in genre detection.
  • Keywords
    Bayes methods; natural language processing; pattern classification; support vector machines; text analysis; AIRS1; AIRS2Parallel; Turkish texts; YTU method; artificial immune recognition systems; author detection; bag of words model; classification performance; document representation; genre detection; k-nearest neighbourhood; naive Bayes; random forest; support vector machine; weighting algorithm; Artificial immune systems; Niobium; Proteins; Radio frequency; Robots; Support vector machine classification; Support vector machines; Testing; Text recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing, Communication and Applications Conference, 2008. SIU 2008. IEEE 16th
  • Conference_Location
    Aydin
  • Print_ISBN
    978-1-4244-1998-2
  • Electronic_ISBN
    978-1-4244-1999-9
  • Type
    conf
  • DOI
    10.1109/SIU.2008.4632548
  • Filename
    4632548
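
The abstract describes reducing a bag-of-words representation to one feature per class by weighting word stems. The record does not give the actual YTU weighting formula, so the following is only a minimal sketch of the general idea under an assumed weighting: each stem's weight for a class is the share of that stem's training occurrences found in that class. The function names `fit_class_weights` and `transform` and the toy data are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def fit_class_weights(docs, labels):
    """Learn per-class stem weights from training documents.

    ASSUMPTION: weights[stem][cls] is the fraction of the stem's training
    occurrences that fall in class cls. This is an illustrative weighting,
    not the YTU formula, which the abstract does not specify.
    """
    classes = sorted(set(labels))
    counts = defaultdict(Counter)  # stem -> Counter(class -> occurrences)
    for stems, label in zip(docs, labels):
        for stem in stems:
            counts[stem][label] += 1
    weights = {}
    for stem, per_class in counts.items():
        total = sum(per_class.values())
        weights[stem] = {c: per_class[c] / total for c in classes}
    return classes, weights


def transform(stems, classes, weights):
    """Map a document (list of stems) to one feature per class."""
    vec = [0.0] * len(classes)
    known = 0  # number of stems seen during training
    for stem in stems:
        if stem in weights:
            known += 1
            for i, c in enumerate(classes):
                vec[i] += weights[stem][c]
    # Average over known stems so document length does not dominate.
    return [v / known for v in vec] if known else vec


# Toy, made-up training data: each document is already a list of stems.
train_docs = [
    ["gol", "mac", "takim"],
    ["mac", "hakem"],
    ["borsa", "faiz"],
    ["faiz", "dolar", "borsa"],
]
train_labels = ["sports", "sports", "economy", "economy"]

classes, weights = fit_class_weights(train_docs, train_labels)
features = transform(["mac", "gol", "faiz"], classes, weights)
print(classes, features)
```

The resulting vector has as many entries as there are classes, regardless of vocabulary size, and could then be passed to any classifier (AIRS, SVM, k-NN and so on) in place of the full bag-of-words vector.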