DocumentCode :
2935845
Title :
Genre and author detection in Turkish texts using artificial immune recognition systems
Author :
Kaban, Zafer ; Diri, Banu
Author_Institution :
Bilgisayar Muhendisligi Bolumu, Yildiz Teknik Univ., Istanbul
fYear :
2008
fDate :
20-22 April 2008
Firstpage :
1
Lastpage :
4
Abstract :
This study investigates the performance of artificial immune recognition systems on genre and author detection, using the method of (H. Yildiz et al., 2007), which is based on a different document representation scheme. Most current studies rely on the bag-of-words model, which takes word roots or stems as features. This increases both the number of features and the classification time. This study uses the method we call YTU, which applies a weighting algorithm to word stems and reduces the number of features to the number of classes, resulting in lower classification time and better performance. The artificial immune recognition algorithms AIRS1, AIRS2 and AIRS2Parallel, tested with this method, improved performance in genre and author detection. The experimental results section compares the classification performance of the classifiers most commonly used for author and genre detection, namely naive Bayes (NB), support vector machine (SVM), random forest (RF) and k-nearest neighbour (k-NN), with that of the artificial immune system algorithms. In genre detection in particular, the AIRS2Parallel classifier achieves the highest performance, 99.6%, together with random forest and k-nearest neighbour. This shows that artificial immune recognition algorithms can be used in genre detection.
Keywords :
Bayes methods; natural language processing; pattern classification; support vector machines; text analysis; AIRS1; AIRS2Parallel; Turkish texts; YTU method; artificial immune recognition systems; author detection; bag of words model; classification performance; document representation; genre detection; k-nearest neighbourhood; naive Bayes; random forest; support vector machine; weighting algorithm; Artificial immune systems; Niobium; Proteins; Radio frequency; Robots; Support vector machine classification; Support vector machines; Testing; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing, Communication and Applications Conference, 2008. SIU 2008. IEEE 16th
Conference_Location :
Aydin
Print_ISBN :
978-1-4244-1998-2
Electronic_ISBN :
978-1-4244-1999-9
Type :
conf
DOI :
10.1109/SIU.2008.4632548
Filename :
4632548