DocumentCode
2935845
Title
Genre and author detection in Turkish texts using artificial immune recognition systems
Author
Kaban, Zafer ; Diri, Banu
Author_Institution
Dept. of Computer Engineering, Yildiz Technical Univ., Istanbul
fYear
2008
fDate
20-22 April 2008
Firstpage
1
Lastpage
4
Abstract
This study investigates the performance of artificial immune recognition systems on genre and author detection, using the method of (H. Yildiz et al., 2007), which represents a document in a different scheme. Most current studies depend on the bag-of-words model, which takes the roots or stems of words as features. This increases both the number of features and the classification time. In this study we use the method we named YTU, which applies a weighting algorithm to word stems and reduces the number of features to the number of classes, resulting in lower classification time and better performance. The artificial immune recognition algorithms AIRS1, AIRS2, and AIRS2Parallel, tested with this method, increased performance in genre and author detection. The experimental results section compares the classification performance of commonly used classifiers for author and genre detection, namely naive Bayes (NB), support vector machine (SVM), random forest (RF), and k-nearest neighbour (k-NN), with that of the artificial immune system algorithms. In genre detection in particular, the AIRS2Parallel classifier, together with random forest and k-nearest neighbour, gives the highest performance, 99.6%. This shows that artificial immune recognition algorithms can be used in genre detection.
Keywords
Bayes methods; natural language processing; pattern classification; support vector machines; text analysis; AIRS1; AIRS2Parallel; Turkish texts; YTU method; artificial immune recognition systems; author detection; bag of words model; classification performance; document representation; genre detection; k-nearest neighbourhood; naive Bayes; random forest; support vector machine; weighting algorithm; Artificial immune systems; Niobium; Proteins; Radio frequency; Robots; Support vector machine classification; Support vector machines; Testing; Text recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing, Communication and Applications Conference, 2008. SIU 2008. IEEE 16th
Conference_Location
Aydin
Print_ISBN
978-1-4244-1998-2
Electronic_ISBN
978-1-4244-1999-9
Type
conf
DOI
10.1109/SIU.2008.4632548
Filename
4632548