DocumentCode
3659672
Title
Author identification based on word distribution in word space
Author
Barathi Ganesh H B; Reshma U; Anand Kumar M
Author_Institution
Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore, India - 641112
fYear
2015
Firstpage
1519
Lastpage
1523
Abstract
Author attribution has become an increasingly challenging research area over the past decade. It is now an essential task in sectors such as forensic analysis, law, and journalism, because it helps identify the author of a document. Here, unigram/bigram features, along with latent semantic features from word space, were extracted, and the similarity of a given document was tested using Random Forest, Logistic Regression, and Support Vector Machine classifiers in order to create a global model. The dataset from the PAN 2014 Author Identification shared task was used. The proposed model achieves a state-of-the-art accuracy of 80%, which is significantly higher than the results reported for the PAN 2014 Author Identification task.
Keywords
"Feature extraction","Semantics","Support vector machines","Vegetation","Computational modeling","Logistics","Accuracy"
Publisher
ieee
Conference_Titel
Advances in Computing, Communications and Informatics (ICACCI), 2015 International Conference on
Print_ISBN
978-1-4799-8790-0
Type
conf
DOI
10.1109/ICACCI.2015.7275828
Filename
7275828